<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Content Simplicity</title>
	<atom:link href="https://contentsimplicity.com/feed/" rel="self" type="application/rss+xml" />
	<link>https://contentsimplicity.com/</link>
	<description>Content. Simplified.</description>
	<lastBuildDate>Wed, 10 Feb 2021 18:18:50 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	

<image>
	<url>https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/cropped-lightbulb_icon-2.png?fit=32%2C32&#038;ssl=1</url>
	<title>Content Simplicity</title>
	<link>https://contentsimplicity.com/</link>
	<width>32</width>
	<height>32</height>
</image> 
<site xmlns="com-wordpress:feed-additions:1">159993585</site>	<item>
		<title>How to Write and Publish Articles That Get Noticed</title>
		<link>https://contentsimplicity.com/how-to-write-and-publish-articles-that-get-noticed/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=how-to-write-and-publish-articles-that-get-noticed&#038;utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=how-to-write-and-publish-articles-that-get-noticed</link>
		
		<dc:creator><![CDATA[Anne B]]></dc:creator>
		<pubDate>Mon, 02 Nov 2020 13:21:00 +0000</pubDate>
		<category><![CDATA[writing]]></category>
		<category><![CDATA[content marketing]]></category>
		<category><![CDATA[marketing]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[publish]]></category>
		<category><![CDATA[seo]]></category>
		<category><![CDATA[sharing]]></category>
		<category><![CDATA[social media]]></category>
		<category><![CDATA[visibility]]></category>
		<category><![CDATA[write]]></category>
		<guid isPermaLink="false">https://contentsimplicity.com/?p=735</guid>

					<description><![CDATA[<p>Everyone wants to know how they can get their articles noticed in the endless expanse that is the internet. Most people think that there must be some kind of trick to it</p>
<p>The post <a href="https://contentsimplicity.com/how-to-write-and-publish-articles-that-get-noticed/">How to Write and Publish Articles That Get Noticed</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p class="wp-block-paragraph"><em style="font-size: 14px;">Image by Ralf Kunze from Pixabay</em></p>



<p class="wp-block-paragraph"><em>This article first appeared in <a href="https://medium.com/swlh/how-to-write-and-publish-articles-that-get-noticed-60e9701daed4" target="_blank" rel="noreferrer noopener" aria-label="Towards Data Science (opens in a new tab)">Towards Data Science</a></em></p>



<h3 class="wp-block-heading">Simple techniques for creating content that’s easy to find and exciting to read</h3>



<p class="wp-block-paragraph"><em>Want to learn how to write, publish, and get noticed?</em></p>






<p class="wp-block-paragraph">I get a lot of views on Medium. About 100,000 every 30 days. As of April, I had been writing for four months and only wrote three to five articles each month.</p>



<p class="wp-block-paragraph">It’s pretty exciting.</p>



<p class="wp-block-paragraph">(Update: Thanks to all of you amazing human beings, I reached 100K readers in April!!!)</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/800/1%2ACLHVKuiSd9nIU49C4RTFhw.png?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph"><em>April-May 2019</em></p>



<p class="wp-block-paragraph">Everyone wants to know how they can write, publish, and get their articles noticed in the endless expanse that is the internet. Whether you’re running a business, launching a new product, or putting everything you have into a blog, you want your content seen.</p>



<p class="wp-block-paragraph">Most people think that there must be some kind of trick to it. There isn’t! You don’t need to be a part of a team that writes hundreds of articles a day. You don’t need to pay for views or hack any systems. There are a ton of simple and free things that you can do right now to make your content stand out and get noticed.</p>



<p class="wp-block-paragraph">Just remember that you need to do what works for you! My posts might be different than yours and my goals might be different than yours. The joy for me is in sharing the cool stuff I know with as many people out in the world as possible. You might want something different.</p>



<p class="wp-block-paragraph">The internet is full of “common knowledge” advice that people who write and publish swear by. This includes things like:</p>



<div class="wp-block-group"><div class="wp-block-group__inner-container is-layout-flow wp-block-group-is-layout-flow">
<ul class="wp-block-list"><li>Write shorter articles. Ones that take 6–8 minutes to read are ideal.</li><li>Publish frequently.</li><li>Publish on weekdays.</li><li>Find a great featured image.</li><li>Keep your paragraphs short.</li></ul>
</div></div>



<p class="wp-block-paragraph">These are great tips! But as you try what “everyone” says is effective, remember to always pay attention to what works best for you as you write, publish, and move forward.</p>



<p class="wp-block-paragraph">You might be surprised.</p>



<p class="wp-block-paragraph">Every couple of weeks I tend to write one 15- to 19-minute piece and publish it on a Saturday. That’s pretty much it.<br>You do you, boo.</p>



<p class="wp-block-paragraph"><strong>Publish on Medium</strong><br>There’s a good chance that you’re already doing this. But if you’re out there blogging all alone and wondering if anyone will ever notice your amazing work, republish your content on Medium! Medium has somewhere in the neighborhood of a gazillion views every month. Take advantage of this when you write and publish your work! You can easily import your content from your existing blog or website and Google will not punish you for it.<br>Importing your content is incredibly simple. Just click on your profile picture in the top-right corner, go to stories, click on “Import a story,” paste in your URL, and you’re basically done. The directions are right here and it’s crazy easy. Your original source will automatically be referenced by a canonical URL and both Google and your SEO will be happy.</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/800/1%2ABYLdjHXTOpEmqPI416x6rw.jpeg?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph"><em>Image by <a class="markup--anchor markup--figure-anchor" href="https://pixabay.com/users/1426260-1426260/?utm_source=link-attribution&amp;utm_medium=referral&amp;utm_campaign=image&amp;utm_content=947739" target="_blank" rel="noopener noreferrer" data-href="https://pixabay.com/users/1426260-1426260/?utm_source=link-attribution&amp;utm_medium=referral&amp;utm_campaign=image&amp;utm_content=947739">Eden Ware</a> from&nbsp;<a class="markup--anchor markup--figure-anchor" href="https://pixabay.com/?utm_source=link-attribution&amp;utm_medium=referral&amp;utm_campaign=image&amp;utm_content=947739" target="_blank" rel="noopener noreferrer" data-href="https://pixabay.com/?utm_source=link-attribution&amp;utm_medium=referral&amp;utm_campaign=image&amp;utm_content=947739">Pixabay</a></em></p>



<p class="wp-block-paragraph"><strong>Content is key</strong></p>



<p class="wp-block-paragraph">This is critical. I know everybody always says this part, but it’s important and I’m saying it again. Write, publish, and share something that you care about and take your time with it. Put your heart and soul into it and then load it up with fun visuals.</p>



<p class="wp-block-paragraph">Now spend as much time editing that thing as you did writing it.</p>



<p class="wp-block-paragraph">Get Grammarly. The free version is great. It will edit as you go, saving you hours of effort and anxiety. Run your post through Hemingway App too. You want to write at an 8th-grade level or below. 6th grade seems to be the sweet spot for my articles. Hemingway will help you easily determine the reading level. This is not about dumbing your pieces down. I’m a top writer in artificial intelligence and technology and write articles at about a 6th-grade reading level.</p>



<p class="wp-block-paragraph">Now spend as much time working on the title of your post as you did writing and editing your post. Seriously. The title can make or break you. You can look at headline analyzers (people seem to like CoSchedule) or just pay attention to which articles you actually click and read throughout your day.</p>



<p class="wp-block-paragraph"><strong>Write out your title and then Google it.</strong> Look at the results.</p>



<p class="wp-block-paragraph">If you were looking for exactly the information that you just wrote about, would you click on your title first?</p>



<p class="wp-block-paragraph">Go hit that “Ready to publish?” button at the top of your screen to see what your article will look like. Would you click on that? Does it say what you want it to say? Does it accurately represent your content? (You’d be surprised at how easy it is to forget that part in the quest to be funny, clever, and/or attention-grabbing.) Did you include a power word? You don’t have to, but people do like them. Did you go too far and turn it into clickbait? Medium readers and curators generally don’t like clickbait, so it’s best if you avoid that. How does your featured image look? Is it interesting and exciting?</p>



<p class="wp-block-paragraph">While it does make sense to use a featured image that works with your article, you don’t need to find an image that literally represents the content that you’ve written. Find an image that evokes an emotion that works with what you’ve written. Find an engaging image that makes someone want to get more information. That can be even more powerful than a literal representation of your content.</p>



<p class="wp-block-paragraph">If you want page views outside of Medium, try Googling the main words in your title. Do you get a zillion hits? Are you ready to compete with that? It’s tempting to want to use keywords that get billions of links, but are you sure you can rank there? If you’re hoping people will find your article, the last thing you want is to end up on page 2,824,716 of a Google search.</p>



<p class="wp-block-paragraph">They say if you’re anywhere past page two of a Google search, your article may as well not exist.</p>



<p class="wp-block-paragraph">If you’re using a keyword tool, I’d suggest that you want to stay in the middle of the road. You’re looking for keywords that a lot of people are looking for, but not ones that absolutely everyone is writing about.<br>The most important thing to keep in mind is that you are joining thousands of other people who are putting their hearts and souls into their pieces and then tossing them into the vast, gaping void that is the internet.<br>Your job is to help people find what you’ve written. Make something that’s bright and shiny and then treat links like breadcrumbs along the way to finding your post. (We’ll get to links in a minute.)</p>



<p class="wp-block-paragraph"><strong>Images, images, images. And GIFs!</strong><br>Whitespace is your friend. You want short paragraphs with lots of whitespace. You want visuals. Get some good pictures! Medium offers Unsplash images inside of your post. Just click the plus sign on a new line and then the magnifying glass icon and you’ll have access to thousands of images. All you have to do is search for an image that makes you happy.</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/800/1%2A8Kuh3by_09Gg98EvcB3djA.png?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">Click the plus&nbsp;sign</p>



<figure class="wp-block-image graf graf--figure graf-after--figure"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/800/1%2ArBgst0Kj6v7WSqzUV5CGOA.png?w=1080&#038;ssl=1" alt=""/><figcaption>Click the magnifying glass to grab an image from Unsplash (or the camera for one of your saved&nbsp;images)</figcaption></figure>



<figure class="wp-block-image graf graf--figure graf-after--figure"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/800/1%2AGc9GIMMaJzILELDqdSNZCQ.png?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">And search away!</p>



<p class="wp-block-paragraph">If you want to step it up a little, check out&nbsp;<a href="https://pixabay.com/" target="_blank" rel="noreferrer noopener">Pixabay</a>,&nbsp;<a href="https://www.pexels.com/" target="_blank" rel="noreferrer noopener">Pexels</a>, and any other free (or paid) sites. Take it to the next level and grab a GIF from&nbsp;<a href="https://giphy.com/" target="_blank" rel="noreferrer noopener">GIPHY</a>! You can find one you want, click on the little link icon, and grab the GIF link. Then come back to Medium, paste that link on its own line and hit “return.” Wait for a second or two (or twenty…), and your GIF will appear like magic.</p>



<figure class="wp-block-image graf graf--figure graf-after--p"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/800/1%2AQdn3vvEEErklM2jZ8eM1JA.png?w=1080&#038;ssl=1" alt=""/><figcaption>Click the link icon on the GIF you&nbsp;like</figcaption></figure>



<figure class="wp-block-image graf graf--figure graf-after--figure"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/800/1%2AccGMFD-xToY3dam7C4WwMw.png?w=1080&#038;ssl=1" alt=""/><figcaption> </figcaption></figure>



<p class="wp-block-paragraph"><em>(Paste the link on its own line in Medium and hit&nbsp;Return.)</em></p>



<p class="wp-block-paragraph">Always make sure you credit and link to the places where you found your images. If you didn’t take the picture, then use the caption space below the image to link to the spot where you found it. If you don’t have the right to use it, then don’t use it.</p>



<h4 class="wp-block-heading">Link to&nbsp;yourself</h4>



<p class="wp-block-paragraph">How often have you been wandering aimlessly around the internet and found an article you liked reasonably well? Fairly often, right? Do you then go hunting for more articles by the same writer? How often do you take the time to go and search for a writer’s profile or website?</p>



<p class="wp-block-paragraph">…basically never?</p>



<p class="wp-block-paragraph">What if they had a link to another one of their articles right there for you to click?</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"><p>I’d guess that people are approximately one million times more likely to click a link to a related story than to go hunting for more pieces from a writer they stumbled across online.</p></blockquote>



<p class="wp-block-paragraph">Pay attention to how <strong>you</strong> interact with articles and stories. If you’ve been on Medium for a while, you might be used to looking for people’s profiles, but what about when you were new here?</p>



<p class="wp-block-paragraph">Search for something on the internet and pay attention to what you do. Do you click on clickbait titles or do you avoid them? Do you read through big walls of text, or do you like short paragraphs and interesting pictures? Every time you like something that you’ve read, do you take a bunch of time to hunt for the writer?</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"><p>We all like to believe that we’re completely unique, but the reality is that a lot of other people out there will behave almost exactly the way you do across the board. Pay attention to what you do!</p></blockquote>



<p class="wp-block-paragraph">Make life easier for your readers and they’ll almost certainly make life better for you.</p>



<h4 class="wp-block-heading">Choose your tags&nbsp;wisely</h4>



<p class="wp-block-paragraph">You can use up to five tags when you publish your article. Use them all! Medium tells you right there how many followers those tags have. Some do better than others for views and interaction. Some have more followers than others. I like to choose one or two really big tags and three medium-sized ones. Keep in mind that the more people use a particular tag for their pieces, the faster your article in that tag might be buried.</p>



<p class="wp-block-paragraph">Choosing tags that are big but not enormous works really well when you’re starting out.</p>



<p class="wp-block-paragraph">If you want to be seen, you probably want to submit your work to a publication. That makes all the difference. You can check out <a href="https://www.smedian.com/pubs" rel="noreferrer noopener" target="_blank">Smedian</a> for top publications, but it can be hard to have your pieces approved and published in the large publications when you’re new to this. Don’t be afraid to start with a smaller publication. Give it a shot!</p>



<p class="wp-block-paragraph">When you have great, well-edited content with images, a great title, an eye-catching featured image, and solid tags, go ahead and publish it.</p>



<p class="wp-block-paragraph">Congratulations!</p>



<p class="wp-block-paragraph">Are you done?</p>



<p class="wp-block-paragraph"><strong>Not even close.</strong></p>



<h4 class="wp-block-heading">Share it!</h4>



<p class="wp-block-paragraph">Get out there and share that article! Put it on social media. Link to it everywhere and ask your people to clap for it if you’re new at this. Text your mom and ask for claps. I know that’s no fun, but I promise you that after your story has about 50 claps or so, the claps come a lot easier for people. No one wants to give the first few claps, so beg, borrow, and steal them if you can. You won’t need to do this forever.</p>



<p class="wp-block-paragraph">Think about places you can share your posts. Put that link on Facebook, Twitter, and LinkedIn. Put it on Instagram or Pinterest if that makes sense. Are you in any online groups where your link might be appropriate? Submit to Hacker News if your post is tech-related. Reddit, StumbleUpon, and Digg are always out there for sharing.</p>



<p class="wp-block-paragraph">You have so many options!</p>



<p class="wp-block-paragraph">If you want to go the extra mile and you have a few pieces to share, consider scheduling your posts through a social media management platform like <a href="https://buffer.com/" rel="noreferrer noopener" target="_blank">Buffer</a>. There’s a little work involved in loading your links and creating your posts. Once they’re in there, though, you can get things set up and forget about it for a few days. Buffer will figure out when the best times to post are and send out your social media stuff for you.</p>



<p class="wp-block-paragraph">Don’t post your article once and then forget it if you want views. Keep putting it out there. <strong>No one remembers that link they saw on Twitter four days ago.</strong> Put it up again and someone new will see it!</p>



<p class="wp-block-paragraph">Help people see you. The internet is enormous and no one is going to find you hiding alone out there in the dark, too proud to light a couple of flares. Use those links!</p>



<p class="wp-block-paragraph">Google loves links, which is another great reason to make sure they’re out there. The Great Google Algorithm seems to prefer posts that have a lot of links to them. Go back to your old posts and add a link or two to the new one. Add a link in your new post to one or two of your old articles! Whenever you post, share a link to your article anywhere it makes sense to share it. Are you in any groups on social media where people share their posts? You should be! Share it there. Participate in those groups as much as you can.</p>



<p class="wp-block-paragraph">It’s overwhelming to try to stay on top of it, but you want that community. Find a Facebook group or two! There are a lot of good ones out there. You might want to check out <a href="https://www.facebook.com/groups/mediummastery/" rel="noreferrer noopener" target="_blank">Medium Mastery</a>, which is a solid and well-established group.</p>



<h4 class="wp-block-heading">Keep writing</h4>



<p class="wp-block-paragraph">Whether you publish once a month or ten times a day, keep writing. The more often you can put good pieces out there, the more people will find you and read what you’ve written. Even though I only publish every week or two, my stats took a hit when things got tough over here and I didn’t publish anything for three weeks.</p>



<p class="wp-block-paragraph">When you finish a piece, take a minute to celebrate and then start the process over again.</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2Aw0wCPtr73bE4Bz9QQUHs4w.jpeg?w=1080&#038;ssl=1" alt=""/><figcaption>Image by <a href="https://pixabay.com/users/Pexels-2286921/?utm_source=link-attribution&amp;utm_medium=referral&amp;utm_campaign=image&amp;utm_content=1836510" rel="noreferrer noopener" target="_blank">Pexels</a> from&nbsp;<a href="https://pixabay.com/?utm_source=link-attribution&amp;utm_medium=referral&amp;utm_campaign=image&amp;utm_content=1836510" rel="noreferrer noopener" target="_blank">Pixabay</a></figcaption></figure>



<p class="wp-block-paragraph">Write something that you’re proud of, share it everywhere, and then write something even better and share that too. Don’t stop sharing. There are thousands of people out there who would love to read what you’ve written. Most of them will miss your post when you publish it. No one is checking every page of Medium every day. That would be impossible.</p>



<p class="wp-block-paragraph">Raise your hand, shine some light, and share your hard work with the world.</p>



<figure class="wp-block-image"><img decoding="async" src="https://cdn-images-1.medium.com/max/800/0*gluQlTTWEfxOJm87" alt=""/></figure>



<p class="wp-block-paragraph">&nbsp;<em style="font-size: 14px;">Photo by <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/@xangriffin?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-creator noopener noreferrer" data-href="https://unsplash.com/@xangriffin?utm_source=medium&amp;utm_medium=referral">Xan Griffin</a> on&nbsp;<a class="markup--anchor markup--figure-anchor" href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-source noopener noreferrer" data-href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral">Unsplash</a></em></p>



<p class="wp-block-paragraph">As always, if you do anything cool with this information, let people know about it in the comments below or reach out anytime on LinkedIn @annebonnerdata!<br>If you want to take a look at some of the other pieces I’ve written for examples of whitespace, images, and post length, head on over to my profile.</p>



<p class="wp-block-paragraph">Thanks for reading!</p>


<span class="et_bloom_bottom_trigger"></span><p>The post <a href="https://contentsimplicity.com/how-to-write-and-publish-articles-that-get-noticed/">How to Write and Publish Articles That Get Noticed</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">735</post-id>	</item>
		<item>
		<title>The Ultimate Beginner’s Guide to Data Scraping, Cleaning, and Visualization</title>
		<link>https://contentsimplicity.com/the-ultimate-guide-to-data-cleaning/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=the-ultimate-guide-to-data-cleaning&#038;utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=the-ultimate-guide-to-data-cleaning</link>
		
		<dc:creator><![CDATA[Anne B]]></dc:creator>
		<pubDate>Wed, 14 Oct 2020 17:13:22 +0000</pubDate>
				<category><![CDATA[artificial intelligence]]></category>
		<category><![CDATA[machine learning]]></category>
		<category><![CDATA[Programming]]></category>
		<guid isPermaLink="false">https://contentsimplicity.com/?p=818</guid>

					<description><![CDATA[<p>If your model has acceptable results but isn’t amazing, take a look at your data! Cleaning and preprocessing your data will make your model a star.</p>
<p>The post <a href="https://contentsimplicity.com/the-ultimate-guide-to-data-cleaning/">The Ultimate Beginner’s Guide to Data Scraping, Cleaning, and Visualization</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></description>
										<content:encoded><![CDATA[<section class="section section--body">
<div class="section-divider"><hr class="section-divider" /></div>
<div class="section-content">
<div class="section-inner sectionLayout--insetColumn">
<p class="graf graf--p"><em>Photo by <a class="markup--anchor markup--p-anchor" href="https://www.pexels.com/@tasveerwala?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels" target="_blank" rel="noopener noreferrer" data-href="https://www.pexels.com/@tasveerwala?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels">Nitin Sharma </a>from <a class="markup--anchor markup--p-anchor" href="https://www.pexels.com/photo/grey-and-white-monkeys-sitting-near-tree-2861847/?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels" target="_blank" rel="noopener noreferrer" data-href="https://www.pexels.com/photo/grey-and-white-monkeys-sitting-near-tree-2861847/?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels">Pexels</a></em></p>
<h2 class="graf graf--h4">How to take your model from unremarkable to amazing simply by cleaning and preprocessing your data</h2>
</div>
<p>Data cleaning done right will change your life.</p>
<div class="section-inner sectionLayout--insetColumn">
<p class="graf graf--p">If you have a model that has acceptable results but isn’t amazing, take a look at your data! Taking the time to clean and preprocess your data the right way can make your model a star.</p>
<figure class="graf graf--figure"><img data-recalc-dims="1" decoding="async" class="graf-image" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2AnWzs22HHAw0yEC5E1kUYQg.jpeg?w=1080&#038;ssl=1" data-image-id="1*nWzs22HHAw0yEC5E1kUYQg.jpeg" data-width="4460" data-height="2973" />
<figcaption class="imageCaption"><em>Photo by <a class="markup--anchor markup--figure-anchor" href="https://www.pexels.com/@burst?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels" target="_blank" rel="noopener noreferrer" data-href="https://www.pexels.com/@burst?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels">Burst </a>from <a class="markup--anchor markup--figure-anchor" href="https://www.pexels.com/photo/adult-tan-and-white-french-bulldog-545063/?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels" target="_blank" rel="noopener noreferrer" data-href="https://www.pexels.com/photo/adult-tan-and-white-french-bulldog-545063/?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels">Pexels</a></em></figcaption>
</figure>
<p class="graf graf--p">In order to look at scraping and preprocessing in more detail, let’s look at some of the work that went into “You Are What You Tweet: Detecting Depression in Social Media via Twitter Usage.” That way, we can really examine the process of scraping Tweets and then cleaning and preprocessing them. We’ll also do a little exploratory visualization, which is an awesome way to get a better sense of what your data looks like! We’re going to do some of the most basic cleaning and preprocessing work here: it’s up to you to really get these Tweets in order when you’re building your model!</p>
<div class="graf graf--mixtapeEmbed"><a class="markup--anchor markup--mixtapeEmbed-anchor" title="https://towardsdatascience.com/you-are-what-you-tweet-7e23fb84f4ed" href="https://towardsdatascience.com/you-are-what-you-tweet-7e23fb84f4ed" data-href="https://towardsdatascience.com/you-are-what-you-tweet-7e23fb84f4ed"><strong class="markup--strong markup--mixtapeEmbed-strong">You Are What You Tweet</strong><br /><em class="markup--em markup--mixtapeEmbed-em">Detecting Depression in Social Media via Twitter Usage</em></a></div>
<div class="graf graf--mixtapeEmbed"><a class="markup--anchor markup--mixtapeEmbed-anchor" title="https://towardsdatascience.com/you-are-what-you-tweet-7e23fb84f4ed" href="https://towardsdatascience.com/you-are-what-you-tweet-7e23fb84f4ed" data-href="https://towardsdatascience.com/you-are-what-you-tweet-7e23fb84f4ed">towardsdatascience.com</a></div>
<h3 class="graf graf--h3">A little background</h3>
<p class="graf graf--p">More than 300 million people suffer from depression and only a fraction receive adequate treatment. Depression is the leading cause of disability worldwide and nearly 800,000 people every year die due to suicide. Suicide is the second leading cause of death in 15–29-year-olds. Diagnoses (and subsequent treatment) for depression are often delayed, imprecise, and/or missed entirely.</p>
<p class="graf graf--p">It doesn’t have to be this way! Social media provides an unprecedented opportunity to transform early depression intervention services, particularly in young adults.</p>
<p class="graf graf--p">Every second, approximately 6,000 Tweets are sent on Twitter, which corresponds to over 350,000 Tweets per minute, 500 million Tweets per day, and around 200 billion Tweets per year. Pew Research Center states that 72% of the public currently uses some type of social media. This project captures and analyzes linguistic markers associated with the onset and persistence of depressive symptoms in order to build an algorithm that can effectively predict depression. By building an algorithm that can analyze Tweets exhibiting self-assessed depressive features, it will be possible for individuals, parents, caregivers, and medical professionals to analyze social media posts for linguistic clues that signal deteriorating mental health far earlier than traditional approaches currently do. Analyzing linguistic markers in social media posts allows for a low-profile assessment that can complement traditional services and raise awareness of depressive signs much sooner.</p>
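<p class="graf graf--p">As a quick sanity check, the per-minute, per-day, and per-year figures quoted above can be derived from the rate of roughly 6,000 Tweets per second. This is only a back-of-the-envelope sketch: the per-second rate is the approximate number from the text, not a measured value, and the variable names are just for illustration.</p>
<pre class="graf graf--pre"># Approximate rate quoted in the text (an estimate, not a measurement)
tweets_per_second = 6000

# Scale the per-second rate up to larger time windows
tweets_per_minute = tweets_per_second * 60
tweets_per_day = tweets_per_minute * 60 * 24
tweets_per_year = tweets_per_day * 365

print(f"{tweets_per_minute:,} per minute")
print(f"{tweets_per_day:,} per day")
print(f"{tweets_per_year:,} per year")</pre>
<p class="graf graf--p">That works out to 360,000 per minute, about 518 million per day, and roughly 189 billion per year, which lines up with the rounded figures above.</p>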
<h3 class="graf graf--h3">Where do we start?</h3>
<p class="graf graf--p">We need data!</p>
<figure class="graf graf--figure"><img data-recalc-dims="1" decoding="async" class="graf-image" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2AlJ6W1hxaVs-jwk0UmVPo3Q.jpeg?w=1080&#038;ssl=1" data-image-id="1*lJ6W1hxaVs-jwk0UmVPo3Q.jpeg" data-width="4334" data-height="2889" />
<figcaption class="imageCaption"><em>Photo by <a class="markup--anchor markup--figure-anchor" href="https://www.pexels.com/@quang-nguyen-vinh-222549?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels" target="_blank" rel="noopener noreferrer" data-href="https://www.pexels.com/@quang-nguyen-vinh-222549?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels">Quang Nguyen Vinh </a>from <a class="markup--anchor markup--figure-anchor" href="https://www.pexels.com/photo/black-and-tan-smooth-chihuahua-in-blue-and-white-plastic-basket-2135383/?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels" target="_blank" rel="noopener noreferrer" data-href="https://www.pexels.com/photo/black-and-tan-smooth-chihuahua-in-blue-and-white-plastic-basket-2135383/?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels">Pexels</a></em></figcaption>
</figure>
<h3 class="graf graf--h3">Gathering Data</h3>
<p> </p>
<p class="graf graf--p">In order to build a depression detector, two kinds of tweets were needed: random tweets that do not necessarily indicate depression, and tweets suggesting that the user may have depression and/or depressive symptoms. A dataset of random tweets can be sourced from the <a class="markup--anchor markup--p-anchor" href="https://www.kaggle.com/kazanova/sentiment140" target="_blank" rel="noopener noreferrer" data-href="https://www.kaggle.com/kazanova/sentiment140">Sentiment140 dataset available on Kaggle</a>, but for this binary classification model, this <a class="markup--anchor markup--p-anchor" href="https://www.kaggle.com/ywang311/twitter-sentiment/data" target="_blank" rel="noopener noreferrer" data-href="https://www.kaggle.com/ywang311/twitter-sentiment/data">dataset which utilizes the Sentiment140 dataset</a> and offers a set of binary labels proved to be the most effective for building a robust model. There are no publicly available datasets of tweets indicating depression, so “depressive” Tweets were retrieved using the Twitter scraping tool TWINT. They were collected by searching for terms specifically related to depression, in particular the unigram lexical terms identified by De Choudhury et al. The scraped Tweets were then manually checked for relevance (for example, keeping Tweets about emotional rather than economic or atmospheric depression), cleaned, and processed.</p>
<p class="graf graf--p"><a class="markup--anchor markup--p-anchor" href="https://github.com/twintproject/twint" target="_blank" rel="noopener noreferrer" data-href="https://github.com/twintproject/twint">TWINT</a> is a remarkably simple tool to use! </p>
<p class="graf graf--p">You can download it right from the command line with:</p>
<pre class="graf graf--pre">pip install twint</pre>
<p class="graf graf--p">If you want to, for example, search for Tweets containing the term “depression” posted since July 20, 2019 and store the results in a CSV file named “depression,” you would run a command like:</p>
<pre class="graf graf--pre">twint -s "depression" --since 2019-07-20 -o depression --csv</pre>
<p class="graf graf--p">Once you’ve gathered the Tweets, you can start cleaning and preprocessing them. You’ll probably wind up with a ton of information that you don’t need, like conversation ids and so on. You may decide to create multiple CSVs that you want to combine. We’ll get to all of that!</p>
<p> </p>
<h3 class="graf graf--h3"><strong class="markup--strong markup--h3-strong">How did the model perform?</strong></h3>
<p> </p>
<p class="graf graf--p">At first? Not that impressively. After a basic cleaning and preprocessing of the data, the best accuracy (even after spending time fine-tuning the model) hovered around 80%. </p>
<p class="graf graf--p">The reason for that really made sense after I examined word frequency and bigrams. Explore your data! Once I looked at the words themselves, I realized that it was going to take a lot of work to clean and prepare the dataset the right way, and that doing so was an absolute necessity. Part of the cleaning process had to be done manually, so don’t be afraid to get in there and get your hands dirty. It takes time, but it’s worth it!</p>
<p class="graf graf--p">In the end? The accuracy of the model was evaluated and compared to a binary classification baseline model using logistic regression. The models were analyzed for accuracy and a classification report was run to determine precision and recall scores. The data were split into training, testing, and validation sets and the accuracy for the model was determined based on the model’s performance with the testing data, which were kept separate. While the performance of the benchmark logistic regression model was 64.32% using the same data, learning rate, and epochs, the LSTM model performed significantly better at 97.21%.</p>
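<p class="graf graf--p">To make the terms above concrete, here is a minimal, dependency-free sketch of how accuracy, precision, and recall are computed for binary labels. The function name and the toy labels are made up for illustration; they are not the project’s data or its evaluation code, and a classification report from a library gives you the same numbers with less work.</p>

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for binary labels, where 1 = depressive."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == 1 and p == 1)
    false_pos = sum(1 for t, p in pairs if t == 0 and p == 1)
    false_neg = sum(1 for t, p in pairs if t == 1 and p == 0)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted or never present
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return accuracy, precision, recall


# Toy labels, invented purely for illustration
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
accuracy, precision, recall = classification_metrics(y_true, y_pred)
print(accuracy, precision, recall)
```

<p class="graf graf--p">Accuracy alone can hide problems when one class dominates, which is why precision and recall are worth checking alongside it.</p>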
<p class="graf graf--p">So how did we get from the scraped Tweets to the results?</p>
<p class="graf graf--p">Practice, practice, practice! (And some serious work.)</p>
<p> </p>
<figure class="graf graf--figure"><img data-recalc-dims="1" decoding="async" class="graf-image" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2Ac85P3tbdIr3dmmsB0fAIZA.jpeg?w=1080&#038;ssl=1" data-image-id="1*c85P3tbdIr3dmmsB0fAIZA.jpeg" data-width="3424" data-height="2283" />
<figcaption class="imageCaption"><em>Photo by <a class="markup--anchor markup--figure-anchor" href="https://www.pexels.com/@dsd-143941?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels" target="_blank" rel="noopener noreferrer" data-href="https://www.pexels.com/@dsd-143941?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels">DSD </a>from <a class="markup--anchor markup--figure-anchor" href="https://www.pexels.com/photo/close-up-photo-of-monkey-on-tree-branch-1829979/?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels" target="_blank" rel="noopener noreferrer" data-href="https://www.pexels.com/photo/close-up-photo-of-monkey-on-tree-branch-1829979/?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels">Pexels</a><br /><br /></em></figcaption>
</figure>
<h3 class="graf graf--h3">Basic Data Cleaning and Preprocessing</h3>
<p> </p>
<p class="graf graf--p">Let’s say we scraped Twitter for the search terms “depression,” “depressed,” “hopeless,” “lonely,” “suicide,” and “antidepressant” and saved each set of scraped Tweets as “tweets.csv” in a folder named after its search term (for example, “hopeless/tweets.csv”).</p>
<p class="graf graf--p">We’ll start with a few imports:</p>
<pre class="graf graf--pre"><strong class="markup--strong markup--pre-strong">import</strong> <strong class="markup--strong markup--pre-strong">pandas</strong> <strong class="markup--strong markup--pre-strong">as</strong> <strong class="markup--strong markup--pre-strong">pd</strong><br /><strong class="markup--strong markup--pre-strong">import</strong> <strong class="markup--strong markup--pre-strong">numpy</strong> <strong class="markup--strong markup--pre-strong">as</strong> <strong class="markup--strong markup--pre-strong">np</strong><br /><strong class="markup--strong markup--pre-strong">import</strong> <strong class="markup--strong markup--pre-strong">matplotlib.pyplot</strong> <strong class="markup--strong markup--pre-strong">as</strong> <strong class="markup--strong markup--pre-strong">plt</strong><br />plt.style.use('fivethirtyeight')<br /><br /><em class="markup--em markup--pre-em"># Jupyter notebook display settings</em><br />%matplotlib inline<br />%config InlineBackend.figure_format = 'retina'<br /><br /><strong class="markup--strong markup--pre-strong">import</strong> <strong class="markup--strong markup--pre-strong">re</strong><br /><strong class="markup--strong markup--pre-strong">from</strong> <strong class="markup--strong markup--pre-strong">nltk.tokenize</strong> <strong class="markup--strong markup--pre-strong">import</strong> WordPunctTokenizer<br />tok = WordPunctTokenizer()</pre>
<p class="graf graf--p">We’ll read one of our CSV files and take a look at the head.</p>
<p> </p>
<pre class="graf graf--pre">hopeless_tweets_df = pd.read_csv('hopeless/tweets.csv')<br />hopeless_tweets_df.head()<br /><br /></pre>
<figure class="graf graf--figure"><img data-recalc-dims="1" decoding="async" class="graf-image" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2ArlQKPvFCmlzKDSb55QwnOw.png?w=1080&#038;ssl=1" data-image-id="1*rlQKPvFCmlzKDSb55QwnOw.png" data-width="1152" data-height="379" /></figure>
<p> </p>
<p class="graf graf--p">First of all, we should get rid of any information stored in the dataset that isn’t necessary. We don’t need names, ids, conversation ids, geolocations, and so on for this project. We can get those out of there with:</p>
</div>
<p> </p>
<div class="section-inner sectionLayout--insetColumn">
<pre class="graf graf--pre">hopeless_tweets_df.drop(['date', 'timezone', 'username', 'name', 'conversation_id', 'created_at', 'user_id', 'place', 'likes_count', 'link', 'retweet', 'quote_url', 'video', 'user_rt_id', 'near', 'geo', 'mentions', 'urls', 'photos', 'replies_count', 'retweets_count'], axis = 1, inplace = <strong class="markup--strong markup--pre-strong">True</strong>)<br /><br /></pre>
<p class="graf graf--p">Now we have this, which is much easier to deal with!</p>
<p> </p>
<figure class="graf graf--figure"><img data-recalc-dims="1" decoding="async" class="graf-image" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2AfpTvVyI02_vjUdjCJnK33w.png?w=1080&#038;ssl=1" data-image-id="1*fpTvVyI02_vjUdjCJnK33w.png" data-width="1152" data-height="273" /></figure>
<p> </p>
<p class="graf graf--p">Now just do that with all of the CSVs you created with your search terms and we can combine our separate datasets into one! </p>
<p> </p>
<pre class="graf graf--pre">df_row_reindex = pd.concat([depression_tweets_df, hopeless_tweets_df, lonely_tweets_df, antidepressant_tweets_df, antidepressants_tweets_df, suicide_tweets_df], ignore_index=<strong class="markup--strong markup--pre-strong">True</strong>)<br /><br />df_row_reindex<br /><br /></pre>
<figure class="graf graf--figure"><img data-recalc-dims="1" decoding="async" class="graf-image" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2A0Lwgfpu7JJ5jRoW9JeydrQ.png?w=1080&#038;ssl=1" data-image-id="1*0Lwgfpu7JJ5jRoW9JeydrQ.png" data-width="1152" data-height="891" /></figure>
<p> </p>
<p class="graf graf--p">Before we go any further, let’s drop the duplicates from the combined dataframe.</p>
<p> </p>
<pre class="graf graf--pre">depressive_twint_tweets_df = df_row_reindex.drop_duplicates()</pre>
<p> </p>
<p class="graf graf--p">And save our dataset as a new CSV!</p>
<p> </p>
<pre class="graf graf--pre">export_csv = depressive_twint_tweets_df.to_csv(r'depressive_unigram_tweets_final.csv')<br /><br /></pre>
<h3 class="graf graf--h3"><strong class="markup--strong markup--h3-strong">More Advanced Preprocessing</strong></h3>
<p> </p>
<p class="graf graf--p">Before the data could be used in the model, it was necessary to expand contractions, lowercase the text, and remove links, hashtags, punctuation, and extra whitespace. Negations also needed to be dealt with, which meant creating a dictionary of negations so that negated words could be handled effectively. Additionally, stop words beyond the standard NLTK list needed to be removed to make the model more robust. These included days of the week and their abbreviations, month names, and the word “Twitter,” which surprisingly showed up prominently when the word clouds were created. The tweets were then tokenized and stemmed with PorterStemmer.</p>
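<p class="graf graf--p">As a rough illustration of the steps just described, here is a small standard-library sketch. The <em>clean_tweet</em> function, the tiny contraction map, and the example Tweet are all hypothetical; the real project used a fuller negation dictionary, and stripping @mentions is an extra assumption that the description above does not mention.</p>

```python
import re
import string

# Tiny illustrative map; a real project would need a much fuller list
CONTRACTIONS = {
    "can't": "cannot",
    "won't": "will not",
    "don't": "do not",
    "i'm": "i am",
}


def clean_tweet(text):
    """Lowercase, drop links/hashtags/mentions, expand contractions, strip punctuation."""
    text = text.lower()
    text = text.replace("\u2019", "'")  # normalise curly apostrophes
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # links
    text = re.sub(r"[@#]\w+", " ", text)  # hashtags and (assumed) mentions
    for short, full in CONTRACTIONS.items():
        text = text.replace(short, full)
    text = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())  # collapse leftover whitespace


example = clean_tweet("I can't sleep and I'm SO tired... #depressed https://t.co/abc123")
print(example)
```

<p class="graf graf--p">Expanding contractions before stripping punctuation matters here: once the apostrophe is gone, “can’t” becomes “cant” and the negation is much harder to recover.</p>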
<p class="graf graf--p">Let’s take out all of the stuff that isn’t going to help us!</p>
<p class="graf graf--p">Imports, of course:</p>
<pre class="graf graf--pre"><strong class="markup--strong markup--pre-strong">import</strong> <strong class="markup--strong markup--pre-strong">pandas</strong> <strong class="markup--strong markup--pre-strong">as</strong> <strong class="markup--strong markup--pre-strong">pd</strong><br /><strong class="markup--strong markup--pre-strong">import</strong> <strong class="markup--strong markup--pre-strong">numpy</strong> <strong class="markup--strong markup--pre-strong">as</strong> <strong class="markup--strong markup--pre-strong">np</strong><br /><strong class="markup--strong markup--pre-strong">import</strong> <strong class="markup--strong markup--pre-strong">matplotlib.pyplot</strong> <strong class="markup--strong markup--pre-strong">as</strong> <strong class="markup--strong markup--pre-strong">plt</strong><br /><strong class="markup--strong markup--pre-strong">import</strong> <strong class="markup--strong markup--pre-strong">seaborn</strong> <strong class="markup--strong markup--pre-strong">as</strong> <strong class="markup--strong markup--pre-strong">sns</strong><br /><strong class="markup--strong markup--pre-strong">import</strong> <strong class="markup--strong markup--pre-strong">itertools</strong><br /><strong class="markup--strong markup--pre-strong">import</strong> <strong class="markup--strong markup--pre-strong">collections<br />import</strong> <strong class="markup--strong markup--pre-strong">re<br />import</strong> <strong class="markup--strong markup--pre-strong">networkx</strong> <strong class="markup--strong markup--pre-strong">as</strong> <strong class="markup--strong markup--pre-strong">nx</strong></pre>
<pre class="graf graf--pre"><strong class="markup--strong markup--pre-strong">import</strong> <strong class="markup--strong markup--pre-strong">nltk<br /></strong>nltk.download(['punkt','stopwords'])<br /><strong class="markup--strong markup--pre-strong">from</strong> <strong class="markup--strong markup--pre-strong">nltk.corpus</strong> <strong class="markup--strong markup--pre-strong">import</strong> stopwords<br /><strong class="markup--strong markup--pre-strong">from</strong> <strong class="markup--strong markup--pre-strong">nltk</strong> <strong class="markup--strong markup--pre-strong">import</strong> bigrams<br /><br /><strong class="markup--strong markup--pre-strong">import</strong> <strong class="markup--strong markup--pre-strong">warnings</strong><br />warnings.filterwarnings("ignore")<br /><br />sns.set(font_scale=1.5)<br />sns.set_style("whitegrid")<br /><strong class="markup--strong markup--pre-strong">from</strong> <strong class="markup--strong markup--pre-strong">vaderSentiment.vaderSentiment</strong> <strong class="markup--strong markup--pre-strong">import</strong> SentimentIntensityAnalyzer<br />analyzer = SentimentIntensityAnalyzer()<br /><br />%matplotlib inline<br />%config InlineBackend.figure_format = 'retina'</pre>
<p> </p>
<p class="graf graf--p">Read your new CSV into a Pandas dataframe:</p>
<pre class="graf graf--pre"><br />df2 = pd.read_csv('depressive_unigram_tweets_final.csv')</pre>
<p> </p>
<p class="graf graf--p">Now let’s see if there are any null values. Let’s clean it up!</p>
<p> </p>
<figure class="graf graf--figure"><img data-recalc-dims="1" decoding="async" class="graf-image" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2ASdKtF_D680jN8WMZWBKrUw.png?w=1080&#038;ssl=1" data-image-id="1*SdKtF_D680jN8WMZWBKrUw.png" data-width="1152" data-height="808" /></figure>
<p> </p>
<p class="graf graf--p">We’ll quickly remove stopwords from the Tweets. (From here on, df_new refers to the dataframe left after handling the null values.)</p>
<p> </p>
<pre class="graf graf--pre">stop_words = set(stopwords.words('english'))<br />df_new['clean_tweet'] = df_new['tweet'].apply(<strong class="markup--strong markup--pre-strong">lambda</strong> x: ' '.join([item <strong class="markup--strong markup--pre-strong">for</strong> item <strong class="markup--strong markup--pre-strong">in</strong> x.split() <strong class="markup--strong markup--pre-strong">if</strong> item <strong class="markup--strong markup--pre-strong">not</strong> <strong class="markup--strong markup--pre-strong">in</strong> stop_words]))<br /><br /></pre>
<p class="graf graf--p">If you want to, you can analyze the Tweets for VADER sentiment analysis scores!</p>
<p> </p>
<pre class="graf graf--pre">df_new['vader_score'] = df_new['clean_tweet'].apply(<strong class="markup--strong markup--pre-strong">lambda</strong> x: analyzer.polarity_scores(x)['compound'])<br /><br /></pre>
<p class="graf graf--p">From there, you can also create labels. For a binary classification model, you may want a binary labelling system. However, be aware of your data! Sentiment scores alone do not indicate depression and it is far too simplistic to assume that a negative score indicates depression. In fact, anhedonia, or loss of pleasure, is an extremely common symptom of depression. Neutral, or flat, Tweets are at least as likely, if not more likely, to be an indicator of depression and should not be ignored. </p>
<p class="graf graf--p">For the purposes of experimentation, you may want to set a sentiment analysis label like this. Feel free to play around with it!</p>
<p> </p>
<pre class="graf graf--pre">positive_num = len(df_new[df_new['vader_score'] &gt;= 0.05])<br />negative_num = len(df_new[df_new['vader_score'] &lt; 0.05])</pre>
<pre class="graf graf--pre">df_new['vader_sentiment_label']= df_new['vader_score'].map(<strong class="markup--strong markup--pre-strong">lambda</strong> x:int(1) <strong class="markup--strong markup--pre-strong">if</strong> x&gt;=0.05 <strong class="markup--strong markup--pre-strong">else</strong> int(0))<br /><br /></pre>
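<p class="graf graf--p">Since neutral Tweets deserve their own attention, you might prefer a three-way label while you experiment. The sketch below is one possible variation, not the labelling used in this project: the <em>vader_label</em> helper is hypothetical, and the cutoffs of 0.05 and -0.05 follow the convention commonly used for VADER compound scores.</p>

```python
def vader_label(compound, threshold=0.05):
    """Map a VADER compound score to 'positive', 'neutral', or 'negative'."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Made-up scores, just to show each branch
labels = [vader_label(score) for score in (0.62, 0.0, -0.40, 0.049)]
print(labels)
```

<p class="graf graf--p">With a dataframe like the one above, you could apply the same function to the score column using map. Keeping “neutral” as its own class lets you look at flat Tweets separately instead of folding them into the negative group.</p>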
<p class="graf graf--p">If you need to, drop what you don’t need</p>
<p> </p>
<pre class="graf graf--pre">df_new = df_new[['Unnamed: 0', 'vader_sentiment_label', 'vader_score', 'clean_tweet']]</pre>
<pre class="graf graf--pre">df_new.head()<br /><br /></pre>
<figure class="graf graf--figure"><img data-recalc-dims="1" decoding="async" class="graf-image" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2Ak7UtBi8Q0qa3lJf-qgAaRQ.png?w=1080&#038;ssl=1" data-image-id="1*k7UtBi8Q0qa3lJf-qgAaRQ.png" data-width="1152" data-height="233" /></figure>
<p> </p>
<p class="graf graf--p">Go ahead and save a csv!</p>
<pre class="graf graf--pre"><br />df_new.to_csv('vader_processed_final.csv')</pre>
<p> </p>
<p class="graf graf--p">Let’s keep playing!</p>
<pre class="graf graf--pre"><br />df_new['text'] = df_new['clean_tweet']<br />df_new['text']<br /><br /></pre>
<figure class="graf graf--figure"><img data-recalc-dims="1" decoding="async" class="graf-image" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2ASNuIYjA-b8ptmwdlx9sWNg.png?w=1080&#038;ssl=1" data-image-id="1*SNuIYjA-b8ptmwdlx9sWNg.png" data-width="1152" data-height="292" /></figure>
<p> </p>
<p class="graf graf--p">We can remove URLs (this pattern also strips punctuation and other non-alphanumeric characters):</p>
<p> </p>
<pre class="graf graf--pre"><strong class="markup--strong markup--pre-strong">def</strong> remove_url(txt):<br />    <strong class="markup--strong markup--pre-strong">return</strong> " ".join(re.sub(r"([^0-9A-Za-z \t])|(\w+:\/\/\S+)", "", txt).split())</pre>
<pre class="graf graf--pre">all_tweets_no_urls = [remove_url(tweet) <strong class="markup--strong markup--pre-strong">for</strong> tweet <strong class="markup--strong markup--pre-strong">in</strong> df_new['text']]<br />all_tweets_no_urls[:5]</pre>
<figure class="graf graf--figure"><img data-recalc-dims="1" decoding="async" class="graf-image" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2A_Ny3bbtlQbogiX9jj_epJg.png?w=1080&#038;ssl=1" data-image-id="1*_Ny3bbtlQbogiX9jj_epJg.png" data-width="1152" data-height="175" /></figure>
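<p class="graf graf--p">If you want to see exactly what that regular expression does, here is a standalone check on two made-up strings. It uses the same pattern as above, written as a raw string. Notice that a link without a scheme (no “https://”) is not recognized as a URL, so only its punctuation is stripped.</p>

```python
import re


def remove_url(txt):
    # Drops anything that is not a letter, digit, space, or tab,
    # plus anything that looks like scheme://something
    return " ".join(re.sub(r"([^0-9A-Za-z \t])|(\w+:\/\/\S+)", "", txt).split())


# Invented examples for illustration
with_scheme = remove_url("Feeling low today... read this: https://example.com/post #tired")
no_scheme = remove_url("see www.example.com")
print(with_scheme)
print(no_scheme)
```

<p class="graf graf--p">The first call returns “Feeling low today read this tired” and the second returns “see wwwexamplecom”. If your Tweets contain a lot of scheme-less links, it may be worth handling those separately before this step.</p>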
<p> </p>
<p class="graf graf--p">Now let’s make everything lowercase and split the Tweets.</p>
<pre class="graf graf--pre"><em class="markup--em markup--pre-em"><br />#lower_case = [word.lower() for word in df_new['text']]</em><br />sentences = df_new['text']</pre>
<pre class="graf graf--pre">all_tweets_no_urls[0].split()</pre>
<pre class="graf graf--pre">words_in_tweet = [tweet.lower().split() <strong class="markup--strong markup--pre-strong">for</strong> tweet <strong class="markup--strong markup--pre-strong">in</strong> all_tweets_no_urls]<br />words_in_tweet[:2]</pre>
<figure class="graf graf--figure"><img data-recalc-dims="1" decoding="async" class="graf-image" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2AzFYb9KQ-ruNM4WKxx7doFQ.png?w=1080&#038;ssl=1" data-image-id="1*zFYb9KQ-ruNM4WKxx7doFQ.png" data-width="1930" data-height="1070" /></figure>
<h3 class="graf graf--h3">Data cleaning done manually</h3>
<p> </p>
<p class="graf graf--p">It’s not fun and it’s not pretty, but manual cleaning was critical. It took hours, but getting rid of references to things like tropical depressions and economic depressions improved the model. Removing Tweets that were movie titles improved the model (you can see “Suicide Squad” in the bigrams below). Removing quoted news headlines that included the search terms improved the model. It felt like it took an eternity to do, but this step made an enormous difference in the robustness of the model.</p>
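<p class="graf graf--p">A simple keyword filter can speed up the manual pass by flagging likely off-topic Tweets for a human to review. This is only a sketch under assumptions: the phrase list is hypothetical (drawn from the examples mentioned above), the sample Tweets are invented, and the original cleaning was done by hand rather than with this code.</p>

```python
import re

# Hypothetical off-topic phrases, based on the examples mentioned above
OFF_TOPIC_PATTERNS = [
    r"\btropical depression\b",
    r"\beconomic depression\b",
    r"\bsuicide squad\b",
]
OFF_TOPIC_RE = re.compile("|".join(OFF_TOPIC_PATTERNS), flags=re.IGNORECASE)


def split_for_review(tweets):
    """Separate tweets into (keep, flagged); flagged ones still need a human look."""
    keep, flagged = [], []
    for tweet in tweets:
        if OFF_TOPIC_RE.search(tweet):
            flagged.append(tweet)
        else:
            keep.append(tweet)
    return keep, flagged


# Invented sample Tweets
sample = [
    "Tropical Depression Nine is expected to strengthen tonight",
    "i feel so hopeless and alone",
    "Just watched Suicide Squad again",
]
keep, flagged = split_for_review(sample)
print(len(keep), len(flagged))
```

<p class="graf graf--p">Flagging rather than deleting is deliberate: someone can genuinely be talking about their own mood in the same Tweet as a movie title, so a person should make the final call.</p>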
<p> </p>
<h3 class="graf graf--h3">Exploratory Visualization and Analysis</h3>
<p> </p>
<p class="graf graf--p">Now let’s look at character and word frequency!</p>
<p class="graf graf--p">It’s fairly easy to analyze the most common words found in the dataset. After removing the stop words, it was apparent that there were certain words that appeared much more frequently than other words.</p>
<p class="graf graf--p">Let’s count our most common words!</p>
<p> </p>
<pre class="graf graf--pre"><em class="markup--em markup--pre-em"># List of all words</em><br />all_words_no_urls = list(itertools.chain(*words_in_tweet))<br /><br /><em class="markup--em markup--pre-em"># Create counter</em><br />counts_no_urls = collections.Counter(all_words_no_urls)<br /><br />counts_no_urls.most_common(15)</pre>
<p> </p>
<p class="graf graf--p">And turn them into a dataframe.</p>
<p> </p>
<pre class="graf graf--pre">clean_tweets_no_urls = pd.DataFrame(counts_no_urls.most_common(15),<br />                             columns=['words', 'count'])<br /><br />clean_tweets_no_urls.head()<br /><br /></pre>
<figure class="graf graf--figure"><img data-recalc-dims="1" decoding="async" class="graf-image" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2AByEy9b_gz852j1s9tcZ0rQ.png?w=1080&#038;ssl=1" data-image-id="1*ByEy9b_gz852j1s9tcZ0rQ.png" data-width="1152" data-height="642" /></figure>
<p> </p>
<p class="graf graf--p">Hmmm. Too many stopwords. Let’s deal with those.</p>
<pre class="graf graf--pre"><br />stop_words = set(stopwords.words('english'))</pre>
<pre class="graf graf--pre"><em class="markup--em markup--pre-em"># Remove stop words from each tweet list of words</em><br />tweets_nsw = [[word <strong class="markup--strong markup--pre-strong">for</strong> word <strong class="markup--strong markup--pre-strong">in</strong> tweet_words <strong class="markup--strong markup--pre-strong">if</strong> <strong class="markup--strong markup--pre-strong">not</strong> word <strong class="markup--strong markup--pre-strong">in</strong> stop_words]<br />              <strong class="markup--strong markup--pre-strong">for</strong> tweet_words <strong class="markup--strong markup--pre-strong">in</strong> words_in_tweet]<br /><br />tweets_nsw[0]<br /><br /></pre>
<p class="graf graf--p">Let’s take another look.</p>
<p> </p>
<pre class="graf graf--pre">all_words_nsw = list(itertools.chain(*tweets_nsw))<br />counts_nsw = collections.Counter(all_words_nsw)<br /><br />counts_nsw.most_common(15)<br /><br /></pre>
<figure class="graf graf--figure"><img data-recalc-dims="1" decoding="async" class="graf-image" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2AUNod3L09ZuPZpIX9911cRA.png?w=1080&#038;ssl=1" data-image-id="1*UNod3L09ZuPZpIX9911cRA.png" data-width="1152" data-height="313" /></figure>
<p> </p>
<p class="graf graf--p">Better, but not great yet. Some of these words don’t tell us much. Let’s make a few more adjustments.</p>
<pre class="graf graf--pre"><br />collection_words = ['im', 'de', 'like', 'one']<br />tweets_nsw_nc = [[w <strong class="markup--strong markup--pre-strong">for</strong> w <strong class="markup--strong markup--pre-strong">in</strong> word <strong class="markup--strong markup--pre-strong">if</strong> <strong class="markup--strong markup--pre-strong">not</strong> w <strong class="markup--strong markup--pre-strong">in</strong> collection_words]<br />                 <strong class="markup--strong markup--pre-strong">for</strong> word <strong class="markup--strong markup--pre-strong">in</strong> tweets_nsw]</pre>
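<p class="graf graf--p">The same approach works for the days, months, and the word “twitter.” Rather than typing them all out, one option is to build the list from Python’s calendar module. This is a sketch, not the project’s code: the names <em>date_words</em> and <em>extra_stop_words</em> are made up, and it assumes the default English locale.</p>

```python
import calendar

# Days and months, full and abbreviated, lowercased to match the tokens
date_words = {
    name.lower()
    for names in (
        calendar.day_name,
        calendar.day_abbr,
        calendar.month_name,
        calendar.month_abbr,
    )
    for name in names
    if name  # month_name[0] and month_abbr[0] are empty strings
}
extra_stop_words = date_words | {"twitter"}

# Invented tokens for illustration
tokens = ["monday", "feel", "lonely", "twitter", "dec", "again"]
filtered = [word for word in tokens if word not in extra_stop_words]
print(filtered)
```

<p class="graf graf--p">One thing to watch for: this set includes “may,” “sun,” “sat,” and “wed,” which are also ordinary words. Check whether removing them changes the meaning of your Tweets before you commit to the list.</p>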
<p> </p>
<p class="graf graf--p">Now let’s count again:</p>
<pre class="graf graf--pre"><em class="markup--em markup--pre-em"><br /># Flatten list of words in clean tweets</em><br />all_words_nsw_nc = list(itertools.chain(*tweets_nsw_nc))<br /><br /><em class="markup--em markup--pre-em"># Create counter of words in clean tweets</em><br />counts_nsw_nc = collections.Counter(all_words_nsw_nc)<br /><br />counts_nsw_nc.most_common(15)<br /><br /></pre>
<figure class="graf graf--figure"><img data-recalc-dims="1" decoding="async" class="graf-image" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2Ayrqdj5Hdx3C8YRdwFpeqXA.png?w=1080&#038;ssl=1" data-image-id="1*yrqdj5Hdx3C8YRdwFpeqXA.png" data-width="1152" data-height="313" /></figure>
<p> </p>
<p class="graf graf--p">Much better! Let’s save this as a dataframe.</p>
<pre class="graf graf--pre"><br />clean_tweets_ncw = pd.DataFrame(counts_nsw_nc.most_common(15),<br />                             columns=['words', 'count'])<br />clean_tweets_ncw.head()</pre>
<p class="graf graf--p">What does that look like? Let’s visualize it!</p>
<pre class="graf graf--pre">fig, ax = plt.subplots(figsize=(8, 8))<br /><br /><em class="markup--em markup--pre-em"># Plot horizontal bar graph</em><br />clean_tweets_no_urls.sort_values(by='count').plot.barh(x='words',<br />                      y='count',<br />                      ax=ax,<br />                      color="purple")<br /><br />ax.set_title("Common Words Found in Tweets (Including All Words)")<br /><br />plt.show()</pre>
<figure class="graf graf--figure"><img data-recalc-dims="1" decoding="async" class="graf-image" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2As3SnsYx4-rwPYvF6Uq7fYQ.png?w=1080&#038;ssl=1" data-image-id="1*s3SnsYx4-rwPYvF6Uq7fYQ.png" data-width="1152" data-height="592" /></figure>
<p> </p>
<p class="graf graf--p">Let’s look at some bigrams!</p>
<pre class="graf graf--pre"><strong class="markup--strong markup--pre-strong"><br />from</strong> <strong class="markup--strong markup--pre-strong">nltk</strong> <strong class="markup--strong markup--pre-strong">import</strong> bigrams<br /><br /><em class="markup--em markup--pre-em"># Create list of lists containing bigrams in tweets</em><br />terms_bigram = [list(bigrams(tweet)) <strong class="markup--strong markup--pre-strong">for</strong> tweet <strong class="markup--strong markup--pre-strong">in</strong> tweets_nsw_nc]<br /><br /><em class="markup--em markup--pre-em"># View bigrams for the first tweet</em><br />terms_bigram[0]</pre>
<pre class="graf graf--pre"><em class="markup--em markup--pre-em"># Flatten list of bigrams in clean tweets (named all_bigrams so it doesn't shadow nltk's bigrams function)</em><br />all_bigrams = list(itertools.chain(*terms_bigram))<br /><br /><em class="markup--em markup--pre-em"># Create counter of words in clean bigrams</em><br />bigram_counts = collections.Counter(all_bigrams)<br /><br />bigram_counts.most_common(20)</pre>
<pre class="graf graf--pre">bigram_df = pd.DataFrame(bigram_counts.most_common(20),<br />                         columns=['bigram', 'count'])<br /><br />bigram_df</pre>
<p> </p>
<p class="graf graf--p">Certain bigrams were also extremely common, including smile and wide, appearing 42,185 times, afraid and loneliness, appearing 4,641 times, and feel and lonely, appearing 3,541 times.</p>
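<p class="graf graf--p">If bigrams are new to you, it may help to see how little is going on under the hood. A bigram is just each word paired with the word that follows it. The sketch below does that pairing with plain Python and a couple of invented token lists; <em>tweet_bigrams</em> is a hypothetical helper used only for this illustration.</p>

```python
from collections import Counter


def tweet_bigrams(tokens):
    """Pair each token with the one that follows it."""
    return list(zip(tokens, tokens[1:]))


# Invented, already-cleaned token lists
tokenized = [["feel", "so", "lonely"], ["feel", "so", "tired"]]

pair_counts = Counter(
    pair for tokens in tokenized for pair in tweet_bigrams(tokens)
)
top = pair_counts.most_common(1)
print(top)
```

<p class="graf graf--p">Here the pair (“feel”, “so”) appears in both Tweets, so it comes out on top with a count of 2. Pairs never cross Tweet boundaries because each Tweet is processed on its own.</p>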
<p> </p>
<figure class="graf graf--figure"><img data-recalc-dims="1" decoding="async" class="graf-image" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2A7liLG0ezzKMFtwa-AjrA3Q.png?w=1080&#038;ssl=1" data-image-id="1*7liLG0ezzKMFtwa-AjrA3Q.png" data-width="288" data-height="516" /></figure>
<p> </p>
<p class="graf graf--p">This is just the beginning of cleaning, preprocessing, and visualizing the data. We can still do a lot from here before we build our model!</p>
<p class="graf graf--p">Once the Tweets were cleaned, it was easy to see the difference between the two datasets by creating a word cloud with the cleaned Tweets. With only an abbreviated TWINT Twitter scraping, the differences between the two datasets were clear:</p>
<p> </p>
<p class="graf graf--p"><strong class="markup--strong markup--p-strong">Random Tweet Word Cloud:</strong></p>
<p> </p>
<figure class="graf graf--figure"><img data-recalc-dims="1" decoding="async" class="graf-image" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2AvUsD6skX6B5c3qwweSHarw.png?w=1080&#038;ssl=1" data-image-id="1*vUsD6skX6B5c3qwweSHarw.png" data-width="1132" data-height="737" /></figure>
<p> </p>
<p class="graf graf--p"><strong class="markup--strong markup--p-strong">Depressive Tweet Word Cloud:</strong></p>
<p> </p>
<figure class="graf graf--figure"><img data-recalc-dims="1" decoding="async" class="graf-image" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2AesvfGDnq8eYhVJiaDm5VSg.png?w=1080&#038;ssl=1" data-image-id="1*esvfGDnq8eYhVJiaDm5VSg.png" data-width="1096" data-height="684" /></figure>
<p> </p>
<p class="graf graf--p">Early in the process, it became clear that the most important part of refining the model to get more accurate results would be the data gathering, cleaning, and preprocessing stage. Until the Tweets were appropriately scraped and cleaned, the model had unimpressive accuracy. Once the Tweets were cleaned and processed with more care, the model’s accuracy improved to 97%.</p>
<p> </p>
</div>
</div>
</section>
<section class="section section--body">
<div class="section-divider"><hr class="section-divider" /></div>
<div class="section-content">
<div class="section-inner sectionLayout--insetColumn">
<p> </p>
<p class="graf graf--p">If you’re interested in learning about the absolute basics of data cleaning and preprocessing, take a look at this article!</p>
<p> </p>
<div class="graf graf--mixtapeEmbed"><a class="markup--anchor markup--mixtapeEmbed-anchor" title="https://towardsdatascience.com/the-complete-beginners-guide-to-data-cleaning-and-preprocessing-2070b7d4c6d" href="https://contentsimplicity.com/data-cleaning-and-preprocessing/" data-href="https://towardsdatascience.com/the-complete-beginners-guide-to-data-cleaning-and-preprocessing-2070b7d4c6d"><strong class="markup--strong markup--mixtapeEmbed-strong">The complete beginner’s guide to data cleaning and preprocessing</strong><br /><em class="markup--em markup--mixtapeEmbed-em">How to successfully prepare your data for a machine learning model in minutes</em></a></div>
<p> </p>
<p class="graf graf--p">Thanks for reading! As always, if you do anything cool with this information, let everyone know about it in the comments below or <a class="markup--anchor markup--p-anchor" href="https://contentsimplicity.com/articles/" target="_blank" rel="noopener noreferrer" data-href="https://contentsimplicity.com/articles/">reach out any time</a>!</p>
</div>
</div>
</section>


<p class="wp-block-paragraph"></p>
<span class="et_bloom_bottom_trigger"></span><p>The post <a href="https://contentsimplicity.com/the-ultimate-guide-to-data-cleaning/">The Ultimate Beginner’s Guide to Data Scraping, Cleaning, and Visualization</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">818</post-id>	</item>
		<item>
		<title>Getting started with Git and GitHub: the complete beginner’s guide</title>
		<link>https://contentsimplicity.com/getting-started-with-git-and-github-the-complete-beginners-guide/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=getting-started-with-git-and-github-the-complete-beginners-guide&#038;utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=getting-started-with-git-and-github-the-complete-beginners-guide</link>
					<comments>https://contentsimplicity.com/getting-started-with-git-and-github-the-complete-beginners-guide/#respond</comments>
		
		<dc:creator><![CDATA[Anne B]]></dc:creator>
		<pubDate>Fri, 04 Sep 2020 04:44:40 +0000</pubDate>
				<category><![CDATA[github]]></category>
		<category><![CDATA[tutorial]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[basic]]></category>
		<category><![CDATA[beginner]]></category>
		<category><![CDATA[Coding]]></category>
		<category><![CDATA[featured]]></category>
		<category><![CDATA[free]]></category>
		<category><![CDATA[git]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[stepup]]></category>
		<guid isPermaLink="false">https://learnbasictech.wordpress.com/?p=159</guid>

					<description><![CDATA[<p>It’s incredibly easy to get started with Git and GitHub. If you’re a fast reader (and you don’t take a lot of time with sign up and installation), you can be up and running on GitHub about ten minutes from right now. Let's get started!</p>
<p>The post <a href="https://contentsimplicity.com/getting-started-with-git-and-github-the-complete-beginners-guide/">Getting started with Git and GitHub: the complete beginner’s guide</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p class="wp-block-paragraph"><em>Photo by&nbsp;<a rel="noreferrer noopener" href="https://unsplash.com/@jamesbold?utm_source=medium&amp;utm_medium=referral" target="_blank">James Bold</a>&nbsp;on&nbsp;<a rel="noreferrer noopener" href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" target="_blank">Unsplash</a></em></p>



<p class="wp-block-paragraph"> </p>



<p class="wp-block-paragraph">Looking to get started with Git and GitHub? Do you need to collaborate with a team? Are you working on a project? Have you recently discovered that you pretty much need to be on GitHub if you want anyone to take you seriously in tech? </p>



<p class="graf graf--p graf-after--p wp-block-paragraph">…do you really just want to contribute to your&nbsp;<a class="markup--anchor markup--p-anchor" href="https://github.com/bonn0062/github_welcome_wall" target="_blank" rel="noopener noreferrer"><strong class="markup--strong markup--p-strong">first open source project</strong></a>?</p>



<p class="graf graf--p graf-after--p wp-block-paragraph">This one’s for you!</p>



<figure class="wp-block-image alignnone progressiveMedia-image js-progressiveMedia-image"><img decoding="async" src="https://cdn-images-1.medium.com/max/1600/0*aUmNuvyIwOyvRPh7" alt=""/><figcaption>Photo by&nbsp;<a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/@greysonjoralemon?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-creator noopener noreferrer">Greyson Joralemon</a>&nbsp;on&nbsp;<a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-source noopener noreferrer">Unsplash</a></figcaption></figure>



<blockquote class="wp-block-quote graf graf--blockquote graf-after--figure is-layout-flow wp-block-quote-is-layout-flow"><p>It’s totally easy to get started with Git. If you’re a fast reader (and you don’t take a lot of time with sign up and installation), you can be up and running on GitHub about ten minutes from right now.</p></blockquote>



<p class="graf graf--p graf-after--blockquote wp-block-paragraph"><span class="markup--quote markup--p-quote is-other">If you go all the way through the article, you can practice cloning an existing repository, creating a branch, making changes, and creating a pull request.</span> Along the way, you&nbsp;<span class="markup--quote markup--p-quote is-other">might also learn how to find your terminal, use terminal commands, and edit a Markdown (.md) file!</span></p>



<p class="wp-block-paragraph"></p>



<p class="graf graf--p graf-after--p wp-block-paragraph"><strong class="markup--strong markup--p-strong">If you do all that, congratulations!</strong></p>



<p class="wp-block-paragraph"></p>



<p class="graf graf--p graf-after--p wp-block-paragraph"><strong class="markup--strong markup--p-strong">You will have contributed to your&nbsp;</strong><a class="markup--anchor markup--p-anchor" href="https://bonn0062.github.io/github_welcome_wall/" target="_blank" rel="noopener noreferrer"><strong class="markup--strong markup--p-strong">first open source project</strong></a><strong class="markup--strong markup--p-strong"> — the&nbsp;</strong><a class="markup--anchor markup--p-anchor" href="https://bonn0062.github.io/github_welcome_wall/welcome_wall.html" target="_blank" rel="noopener noreferrer"><strong class="markup--strong markup--p-strong">GitHub Welcome Wall</strong></a><strong class="markup--strong markup--p-strong">!&nbsp;</strong>(If you want to go straight to the open source contribution part, scroll down until you hit the section called, “<strong class="markup--strong markup--p-strong">Let’s do this!</strong>”)</p>



<p class="graf graf--p graf-after--p wp-block-paragraph">This article will get you up and running with the basics. There’s a lot of stuff to learn if you want to use Git and GitHub like a pro, of course. You can go way beyond this introductory information! We’re going to leave the next-level stuff for another time, though.</p>



<p class="graf graf--p graf-after--p wp-block-paragraph">Let’s get started!</p>



<p class="wp-block-paragraph"></p>



<h3 class="graf graf--h4 graf-after--p wp-block-heading" id="9847">What is Git? What’s&nbsp;GitHub?</h3>



<p class="graf graf--p graf-after--h4 wp-block-paragraph"><span class="markup--quote markup--p-quote is-other"><strong class="markup--strong markup--p-strong">Git</strong>&nbsp;is the version control tech of choice for basically everybody right now, from developers to designers.</span>&nbsp;<strong class="markup--strong markup--p-strong">GitHub</strong>&nbsp;is the social code-hosting platform that’s currently used more than any other. It’s a place where you can play and experiment. It’s a place where you can find (and play around with) the most incredible open-source information, emerging technologies, features, and designs. It’s a place to learn and it’s a place to get involved. You can keep code there for work or for school, and you can grab some sweet code that you want to explore further.&nbsp;<span class="markup--quote markup--p-quote is-other">You can even host websites&nbsp;<strong class="markup--strong markup--p-strong">for free</strong>&nbsp;directly from your repository!</span>&nbsp;(<a class="markup--anchor markup--p-anchor" rel="noopener noreferrer" href="https://bonn0062.github.io/github_welcome_wall/welcome_wall.html" target="_blank">Our project is hosted right from the GitHub repository!</a>)</p>



<p class="wp-block-paragraph"></p>



<figure class="wp-block-image alignnone progressiveMedia-image js-progressiveMedia-image"><img decoding="async" src="https://cdn-images-1.medium.com/max/1600/0*fN_p7JZlmDMOJov8" alt=""/><figcaption>Photo by&nbsp;<a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/@haughters?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-creator noopener noreferrer">Jamie Haughton</a>&nbsp;on&nbsp;<a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-source noopener noreferrer">Unsplash</a></figcaption></figure>



<p class="graf graf--p graf-after--figure wp-block-paragraph">There are a ton of ways to use Git and GitHub, but getting started with GitHub doesn’t have to be overwhelming. You don’t need to be some kind of master coder or anything.&nbsp;<span class="markup--quote markup--p-quote is-other">You can even do the most important things right on the GitHub website!</span></p>



<p class="graf graf--p graf-after--p wp-block-paragraph"><span class="markup--quote markup--p-quote is-other">That being said, it’s a good idea to find your terminal and get just the tiniest bit comfortable with it.</span>&nbsp;Terminal commands make things so much faster! I’ll definitely show you how to get started using the GitHub website.&nbsp;<span class="markup--quote markup--p-quote is-other">I’ll also show you some terminal commands that you might want to use to make your life just a little bit nicer.</span></p>



<blockquote class="wp-block-quote graf graf--blockquote graf-after--p is-layout-flow wp-block-quote-is-layout-flow"><p>Any time you see a command in this article that includes these marks:&nbsp;<code class="markup--code markup--blockquote-code">&lt; &gt;</code>&nbsp;, you want to delete those marks and replace what’s between them with your own information.</p></blockquote>



<blockquote class="wp-block-quote graf graf--blockquote graf-after--blockquote is-layout-flow wp-block-quote-is-layout-flow"><p>Let’s say you see something like&nbsp;<code class="markup--code markup--blockquote-code">git add &lt;filename&gt;</code>. That means that you would type, for example,&nbsp;<code class="markup--code markup--blockquote-code">git add hello_world.py</code>&nbsp;if you wanted to stage a file named “hello_world.py” so that it can be committed to your repository.</p></blockquote>



<p class="graf graf--p graf-after--blockquote wp-block-paragraph"><span class="markup--quote markup--p-quote is-other">I’m going to give you a lot of explanation here, but these are all the terminal commands that you really need to know to get started:</span></p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p">git clone
git status
git add
git commit -m "&lt;message&gt;"
git push</pre>



<p class="graf graf--p graf-after--pre wp-block-paragraph">That’s it! Those are the big ones! If you have a handle on those, you’re good to go. You can start working on your projects immediately!</p>
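<p class="wp-block-paragraph">Here’s a minimal sketch of those five commands run in order. It makes some assumptions so that it can run without a GitHub account: a local bare repository called <code>remote.git</code> stands in for a repository hosted on GitHub, and the folder names, file name, user name, and email are all placeholders.</p>

```shell
set -e  # stop at the first failing command

# Hypothetical scratch area; nothing here touches your real projects.
demo_dir="$(mktemp -d)"
cd "$demo_dir"

# A local bare repository stands in for a repository hosted on GitHub.
git init --quiet --bare remote.git

# git clone: copy the stand-in "remote" repository into a working folder.
git clone --quiet remote.git my_project
cd my_project

# Identify yourself for this repository only (placeholder values).
git config user.name "Example User"
git config user.email "user@example.com"

# Create a file, then ask Git what it sees.
touch hello_world.py
git status --short        # the new file is listed as untracked

# git add: stage the file. git commit: record it with a message.
git add hello_world.py
git commit --quiet -m "Add hello_world.py"

# git push: send the commit back to the remote.
# "origin HEAD" spells out the destination so the sketch does not
# depend on any upstream settings.
git push --quiet origin HEAD
```

<p class="wp-block-paragraph">With a real project, you would clone the link you copy from GitHub instead of <code>remote.git</code>, and a plain <code>git push</code> is usually all you need.</p>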






<figure class="wp-block-image alignnone progressiveMedia-image js-progressiveMedia-image"><img decoding="async" src="https://cdn-images-1.medium.com/max/1600/0*BsszeDXPku8ZUglQ" alt=""/><figcaption>Photo by&nbsp;<a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/@delaneykate_?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-creator noopener noreferrer">Delaney Dawson</a>&nbsp;on&nbsp;<a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-source noopener noreferrer">Unsplash</a></figcaption></figure>



<p class="wp-block-paragraph"></p>



<p class="graf graf--p graf-after--figure wp-block-paragraph">We’ll also talk about</p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p">git init
git branch
git merge
git checkout</pre>



<p class="graf graf--p graf-after--pre wp-block-paragraph">You might be working with other people, or you might want to make changes and test them out before you really commit them.&nbsp;<span class="markup--quote markup--p-quote is-other">The commands above are what you need to get started with collaboration.</span></p>



<p class="graf graf--p graf-after--p wp-block-paragraph"><code class="markup--code markup--p-code">git help</code></p>



<p class="graf graf--p graf-after--p wp-block-paragraph">is also seriously useful if you’re just starting out! We’ll discuss that too.</p>



<p class="graf graf--p graf-after--p wp-block-paragraph">(If you’re on a Mac, you already have a terminal! You can search for it by clicking on the magnifying glass icon in the upper right-hand corner of your screen and searching for the word “terminal.”)</p>



<p class="wp-block-paragraph"></p>



<p class="wp-block-paragraph"></p>



<h3 class="graf graf--h4 graf-after--p wp-block-heading" id="1215">Step 1: Sign up and installation!</h3>



<p class="wp-block-paragraph"></p>



<p class="graf graf--p graf-after--h4 wp-block-paragraph">Go to&nbsp;<a class="markup--anchor markup--p-anchor" rel="noopener noreferrer" href="https://github.com/" target="_blank">GitHub</a>&nbsp;and sign up for an account. You could just stop there and GitHub would work just fine. It’s a good idea, though, to&nbsp;<a class="markup--anchor markup--p-anchor" rel="noopener noreferrer" href="https://git-scm.com/downloads" target="_blank">install Git</a>&nbsp;if you haven’t already.&nbsp;<span class="markup--quote markup--p-quote is-other">You can absolutely get started without it, but if you want to work on your local computer, then you want to have Git installed.</span>&nbsp;You can download it or&nbsp;<a class="markup--anchor markup--p-anchor" rel="noopener noreferrer" href="https://gist.github.com/derhuerst/1b15ff4652a867391f03" target="_blank">install it via your package manager</a>&nbsp;instead.</p>



<p class="graf graf--p graf-after--p wp-block-paragraph">Now go to your terminal and introduce yourself to Git!&nbsp;<span class="markup--quote markup--p-quote is-other">To set your username for&nbsp;<strong class="markup--strong markup--p-strong">every repository</strong>&nbsp;on your computer, type</span></p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p"><span class="markup--quote markup--pre-quote is-other">git config --global user.name</span><span class="markup--quote markup--pre-quote is-other"> "&lt;your_name_here&gt;"</span></pre>



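<p class="graf graf--p graf-after--pre wp-block-paragraph">replacing <code class="markup--code markup--p-code">&lt;your_name_here&gt;</code> with your own name and keeping the quotation marks. You can use any name or handle you want. If you want to set your name for just one repository, run the command inside that repository and leave out <code class="markup--code markup--p-code">--global</code>.</p>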



<p class="graf graf--p graf-after--p wp-block-paragraph"><span class="markup--quote markup--p-quote is-other">Now tell Git your email address. Make sure it’s the same email you used when you signed up for GitHub:</span></p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p"><span class="markup--quote markup--pre-quote is-other">git config --global user.email</span> "&lt;your_email@email.com&gt;"</pre>



<p class="graf graf--p graf-after--pre wp-block-paragraph">It’s easy to keep your email private, and you can find those instructions in&nbsp;<a class="markup--anchor markup--p-anchor" href="https://help.github.com/en/articles/blocking-command-line-pushes-that-expose-your-personal-email-address" target="_blank" rel="noopener noreferrer">this article</a>. You only need to check two boxes in your GitHub account.</p>
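<p class="wp-block-paragraph">If you’d like to double-check what Git has recorded, you can read the values back. This is only a sketch: it works in a throwaway repository and leaves out <code>--global</code> so that it doesn’t change your real settings, and the name and email are placeholders.</p>

```shell
set -e  # stop at the first failing command

# Hypothetical scratch repository, so your real settings stay untouched.
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init --quiet

# Without --global, these settings apply to this repository only.
git config user.name "Example User"
git config user.email "user@example.com"

# Read the values back to confirm what Git will record on your commits.
git config --get user.name
git config --get user.email
```

<p class="wp-block-paragraph">To check the values you set for every repository, add <code>--global</code> to the last two commands.</p>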



<p class="graf graf--p graf-after--p wp-block-paragraph"><strong class="markup--strong markup--p-strong">Now you’re ready to start using Git on your computer!</strong></p>



<figure class="wp-block-image alignnone progressiveMedia-image js-progressiveMedia-image"><img decoding="async" src="https://cdn-images-1.medium.com/max/1600/0*8c8UUki9Yh2yrL9r" alt=""/><figcaption>Photo by&nbsp;<a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/@omgitsmattyvee?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-creator noopener noreferrer">Matty Adame</a>&nbsp;on&nbsp;<a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-source noopener noreferrer">Unsplash</a></figcaption></figure>



<p class="wp-block-paragraph"></p>



<p class="graf graf--p graf-after--figure wp-block-paragraph"><span class="markup--quote markup--p-quote is-other">To get started, you can create a new repository on the GitHub website or perform a&nbsp;<code class="markup--code markup--p-code">git init</code>&nbsp;to create a new repository from your project directory.</span></p>



<p class="graf graf--p graf-after--p wp-block-paragraph">The repository consists of three ‘trees.’&nbsp;<span class="markup--quote markup--p-quote is-other">First is the&nbsp;<strong class="markup--strong markup--p-strong">working directory</strong>, which holds the actual files.</span>&nbsp;<span class="markup--quote markup--p-quote is-other">The second one is the&nbsp;<strong class="markup--strong markup--p-strong">index</strong>, also called the staging area, which holds the changes you’ve marked for your next commit.</span>&nbsp;<span class="markup--quote markup--p-quote is-other">Then there’s&nbsp;<strong class="markup--strong markup--p-strong">HEAD</strong>, which points to the last commit you made.</span></p>
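<p class="wp-block-paragraph">One way to watch a file move through those three trees is to run <code>git status --short</code> after each step. The sketch below assumes a throwaway repository and a placeholder file called <code>notes.md</code>; neither comes from a real project.</p>

```shell
set -e  # stop at the first failing command

# Hypothetical scratch repository with placeholder identity values.
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init --quiet
git config user.name "Example User"
git config user.email "user@example.com"

# 1. Working directory: the file exists on disk, but Git is not tracking it.
touch notes.md
git status --short    # shows "?? notes.md"

# 2. Index (staging area): the file is queued for the next commit.
git add notes.md
git status --short    # shows "A  notes.md"

# 3. HEAD: after committing, HEAD points at the new commit.
git commit --quiet -m "Add notes"
git log --oneline -1  # the commit that HEAD points to
git status --short    # prints nothing, because all three trees agree
```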



<p class="wp-block-paragraph"></p>



<h3 class="graf graf--h4 graf-after--p wp-block-heading" id="18a3">I’m already comfortable with the terminal (Option&nbsp;1)</h3>



<p class="graf graf--p graf-after--h4 wp-block-paragraph">Here’s how you can get started right from the terminal:</p>



<p class="graf graf--p graf-after--p wp-block-paragraph"><span class="markup--quote markup--p-quote is-other">If you have a project directory, just go to your terminal and in your project directory run the command</span></p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p">git init</pre>



<p class="graf graf--p graf-after--pre wp-block-paragraph">You might also see this written with a dot, which just means “the current directory”:</p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p">git init .</pre>



<p class="graf graf--p graf-after--pre wp-block-paragraph">Both forms do the same thing. Neither one adds your files to the repository yet; that’s what <code class="markup--code markup--p-code">git add</code> is for.</p>



<p class="graf graf--p graf-after--p wp-block-paragraph">Let’s say you have a folder for your project called “new_project.”&nbsp;<span class="markup--quote markup--p-quote is-other">You could head on over to that folder in your terminal window and add a local repository to it by running</span></p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p">cd new_project
git init</pre>



<p class="graf graf--p graf-after--pre wp-block-paragraph">Now you have a new hidden directory called&nbsp;<code class="markup--code markup--p-code">.git</code>&nbsp;in your project directory.&nbsp;<span class="markup--quote markup--p-quote is-other">This is where Git stores what it needs so that it can track your project.</span>&nbsp;Now you can add files to the staging area one by one with</p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p">git add &lt;filename_one&gt;</pre>



<p class="graf graf--p graf-after--pre wp-block-paragraph">or run</p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p">git add .</pre>



<p class="graf graf--p graf-after--pre wp-block-paragraph">to add all of your files to the staging area. You can commit these changes with the command</p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p">git commit -m "&lt;add a commit message here&gt;"</pre>



<p class="graf graf--p graf-after--pre wp-block-paragraph">and if you’re happy with your changes, you can run</p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p">git push</pre>



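<p class="graf graf--p graf-after--pre wp-block-paragraph"><span class="markup--quote markup--p-quote is-other">to push your changes to GitHub.</span>&nbsp;(If you started with <code class="markup--code markup--p-code">git init</code> instead of cloning, Git doesn’t know where to push yet. Connect your project to a GitHub repository first with <code class="markup--code markup--p-code">git remote add origin &lt;repository_url&gt;</code>, then run <code class="markup--code markup--p-code">git push -u origin &lt;branch_name&gt;</code> the first time.)&nbsp;<span class="markup--quote markup--p-quote is-other">You can check whether or not you have changes to commit or push at any time by running</span></p>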



<pre class="wp-block-preformatted graf graf--pre graf-after--p"><span class="markup--quote markup--pre-quote is-other">git status</span></pre>



<p class="graf graf--p graf-after--pre wp-block-paragraph">If you made some changes, you can stage your updated files one at a time with</p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p">git add &lt;filename&gt;</pre>



<p class="graf graf--p graf-after--pre wp-block-paragraph">or</p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p">git add --all</pre>



<p class="graf graf--p graf-after--pre wp-block-paragraph"><span class="markup--quote markup--p-quote is-other">Then commit them with your commit message and push them through.</span></p>



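<p class="graf graf--p graf-after--p wp-block-paragraph">That’s it! You can now initialize a repository, commit files, commit changes, and push them through to your default branch (<code class="markup--code markup--p-code">master</code> in this article; newer repositories often call it <code class="markup--code markup--p-code">main</code>).</p>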
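<p class="wp-block-paragraph">Putting Option 1 together, here’s a rough sketch of the whole flow for a project that starts with <code>git init</code>. It rests on a few assumptions: a local bare repository named <code>github_stand_in.git</code> plays the part of an empty repository you created on GitHub, and the file names, user name, and email are placeholders.</p>

```shell
set -e  # stop at the first failing command

# Hypothetical scratch area.
demo_dir="$(mktemp -d)"
cd "$demo_dir"

# Stand-in for an empty repository you created on GitHub.
git init --quiet --bare github_stand_in.git

# Start a project folder and turn it into a repository.
mkdir new_project
cd new_project
git init --quiet
git config user.name "Example User"
git config user.email "user@example.com"

# Stage everything in the folder and commit it.
touch README.md hello_world.py
git add .
git commit --quiet -m "First commit"

# Tell Git where to push. On GitHub this would be your repository URL.
git remote add origin "$demo_dir/github_stand_in.git"

# -u remembers the destination, so a plain "git push" works afterwards.
git push --quiet -u origin HEAD

# Shows the current branch and the remote branch it now tracks.
git status --short --branch
```

<p class="wp-block-paragraph">If you cloned your repository instead of running <code>git init</code>, the remote is already set up and you can skip the <code>git remote add</code> step.</p>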



<p class="graf graf--p graf-after--p wp-block-paragraph">If you’ve got this, just scroll down to “<strong class="markup--strong markup--p-strong">Learning to work with others</strong>”&nbsp;to move on to branching and collaboration!</p>



<p class="wp-block-paragraph"></p>



<figure class="wp-block-image alignnone progressiveMedia-image js-progressiveMedia-image"><img decoding="async" src="https://cdn-images-1.medium.com/max/1600/0*qZzwzvj4kkoU8-UH" alt=""/><figcaption>Photo by&nbsp;<a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/@dear_jondog?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-creator noopener noreferrer">Jonathan Daniels</a>&nbsp;on&nbsp;<a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-source noopener noreferrer">Unsplash</a></figcaption></figure>



<p class="wp-block-paragraph"></p>



<h3 class="graf graf--h4 graf-after--figure wp-block-heading" id="d5cc">I don’t know what you just said (Option&nbsp;2)</h3>



<p class="graf graf--p graf-after--h4 wp-block-paragraph">I’m going to assume that if you’re interested in Option 2, you’re brand new to all of this. Maybe you have a folder full of files (or you plan to have one) that you want to put on GitHub, and you just don’t know how to do that.</p>



<p class="graf graf--p graf-after--p wp-block-paragraph">Let’s make that happen!</p>



<p class="graf graf--p graf-after--p wp-block-paragraph">Say you want to create a new repository. (You probably do! That’s where your project will live. If you aren’t going to create a new repository, you probably want to clone an existing repository.&nbsp;<span class="markup--quote markup--p-quote is-other">We’ll talk about that next, but that’s how you grab someone else’s project and information that you need for your job or the course you’re taking.)</span></p>



<p class="graf graf--p graf-after--p wp-block-paragraph"><span class="markup--quote markup--p-quote is-other">Your&nbsp;<strong class="markup--strong markup--p-strong">repository</strong>&nbsp;is where you’ll organize your project.</span>&nbsp;You can keep&nbsp;<span class="markup--quote markup--p-quote is-other">folders, files, images, videos, spreadsheets, Jupyter notebooks, data sets, and anything else your project needs.</span>&nbsp;Before you can work with Git, you have to initialize a repository for your project and set it up so that Git will manage it. You can do this right on the GitHub website.</p>



<p class="graf graf--p graf-after--p wp-block-paragraph"><span class="markup--quote markup--p-quote is-other">It’s a smart idea to include a&nbsp;<strong class="markup--strong markup--p-strong">README</strong>&nbsp;file with information about your project.</span>&nbsp;You can create one at the same time that you create your repository with the click of a checkbox.</p>



<ul class="postList wp-block-list"><li id="271a" class="graf graf--li graf-after--p">Go to the GitHub website, look in the upper right corner, and click the + sign and then click “New repository.”</li></ul>



<ul class="postList wp-block-list"><li id="17ae" class="graf graf--li graf-after--li">Name the repository, and add a quick description.</li></ul>



<ul class="postList wp-block-list"><li>Decide whether you want this to be a public or a private repository.</li><li><span class="markup--quote markup--li-quote is-other">Click “Initialize this repository with a README” if you want to include the README file.</span>&nbsp;(I definitely recommend doing this!&nbsp;<span class="markup--quote markup--li-quote is-other">It’s the first thing people are going to look at when they check out your repository.</span>&nbsp;It’s also a great place to put the information that someone needs in order to understand or run the project.)</li></ul>



<figure class="wp-block-image graf graf--figure graf-after--li"><img data-recalc-dims="1" decoding="async" src="https://learnbasictech.files.wordpress.com/2019/03/cd0f8-1upckyu4ptowf8igbt11m0g.png?w=1080" alt=""/><figcaption>New repository</figcaption></figure>



<figure class="wp-block-image graf graf--figure graf-after--figure"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/c9602-f0174-1acdw3oav2cgsoqqgraubhg.png?w=1080" alt=""/><figcaption>Creating your new repository</figcaption></figure>



<p class="wp-block-paragraph"></p>



<p class="graf graf--p graf-after--figure wp-block-paragraph">You can totally start working right from this point if you want to! You can upload files, edit files, and so on right from your repository on the GitHub website. However, you might not be satisfied with only this option.</p>



<p class="graf graf--p graf-after--p wp-block-paragraph">There are two ways to make changes to your project.&nbsp;<span class="markup--quote markup--p-quote is-other">You can make changes in your files/notebooks on your computer and you can also make changes right on GitHub.</span></p>



<p class="graf graf--p graf-after--p wp-block-paragraph">Let’s say you want to make some changes to your README file right on GitHub.</p>



<ul class="postList wp-block-list"><li id="c10b" class="graf graf--li graf-after--p">First, go to your repository.</li></ul>



<ul class="postList wp-block-list"><li id="fb4e" class="graf graf--li graf-after--li">Click the name of the file to bring up that file (for example, click “README.md” to go to the readme file).</li></ul>



<ul class="postList wp-block-list"><li id="f265" class="graf graf--li graf-after--li">Click the pencil icon in the upper right corner of the file and make some changes.</li></ul>



<ul class="postList wp-block-list"><li id="4b48" class="graf graf--li graf-after--li">Write a short message in the box that describes the changes you made (and an extended description if you want).</li><li id="8fc2" class="graf graf--li graf-after--li">Click the “Commit changes” button.</li></ul>



<figure class="wp-block-image graf graf--figure graf-after--li"><img data-recalc-dims="1" decoding="async" src="https://learnbasictech.files.wordpress.com/2019/03/3f5d1-1-txpkgtgyfuyuhrrhmz-1q.png?w=1080" alt=""/><figcaption>Editing your file on&nbsp;GitHub</figcaption></figure>



<figure class="wp-block-image graf graf--figure graf-after--figure"><img data-recalc-dims="1" decoding="async" src="https://learnbasictech.files.wordpress.com/2019/03/14e36-1xvxpr_yyui2bo782crzema.png?w=1080" alt=""/><figcaption>Committing your&nbsp;changes</figcaption></figure>



<p class="graf graf--p graf-after--figure wp-block-paragraph">Now the changes have been made to the README file in your new repository!&nbsp;<span class="markup--quote markup--p-quote is-other">(I quickly want to draw your attention to the little button you can check in the image above that will let you create a new branch for this commit and start a pull request.</span>&nbsp;We’ll talk about this later!)</p>



<p class="graf graf--p graf-after--p wp-block-paragraph">Pretty easy, right?</p>



<p class="graf graf--p graf-after--p wp-block-paragraph"><span class="markup--quote markup--p-quote is-other">I prefer to work with files on my local computer rather than try to make everything work from the GitHub website, so let’s set that up now.</span></p>



<p class="wp-block-paragraph"></p>



<h3 class="graf graf--h4 graf-after--p wp-block-heading" id="cd1d">Gimme that&nbsp;project!</h3>



<p class="graf graf--p graf-after--h4 wp-block-paragraph">You might want to clone your new repository so that you can work on it on your local computer, or you might have an existing repository that you want to clone. (That’s something you might need to do for a project or course.)</p>



<p class="graf graf--p graf-after--p wp-block-paragraph"><span class="markup--quote markup--p-quote is-other">In order to&nbsp;<strong class="markup--strong markup--p-strong">clone a repository</strong>&nbsp;onto your computer, go to the repository on the GitHub website and click the big green button that says “Clone or download.”</span>&nbsp;(You can definitely download the repository right there and skip the terminal stuff if you just can’t deal with it. But I believe in you, so keep going!)&nbsp;<span class="markup--quote markup--p-quote is-other">Make sure it says “Clone with HTTPS.”</span>&nbsp;Now click the clipboard icon to copy the link to your clipboard (or highlight that link and copy it).</p>



<figure class="wp-block-image graf graf--figure graf-after--p"><img data-recalc-dims="1" decoding="async" src="https://learnbasictech.files.wordpress.com/2019/03/d5b6a-1y2kfams5exnsq-e6jpaqfa.png?w=1080" alt=""/><figcaption>Clone or download a repository</figcaption></figure>



<p class="graf graf--p graf-after--figure wp-block-paragraph">Now you’ll open up your&nbsp;<strong class="markup--strong markup--p-strong">terminal</strong>&nbsp;and get yourself to the place where you want that repository to land. You might be able to, for instance, type</p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p">cd Desktop</pre>



<p class="graf graf--p graf-after--pre wp-block-paragraph">to get onto the desktop. Then clone your repository right there to make it easy to find. To clone the repository, you type</p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p">git clone &lt;that_thing_you_just_copied&gt;</pre>



<p class="graf graf--p graf-after--pre wp-block-paragraph">Simple! (Don’t forget to change the information between the&nbsp;<code class="markup--code markup--p-code">&lt; &gt;</code>&nbsp;marks to the link you just copied! Also, make sure you delete the&nbsp;<code class="markup--code markup--p-code">&lt; &gt;</code>.)</p>



<blockquote class="wp-block-quote graf graf--blockquote graf-after--p is-layout-flow wp-block-quote-is-layout-flow"><p>If you haven’t moved around in your terminal before, you can move around slowly with the&nbsp;<code class="markup--code markup--blockquote-code">cd</code>&nbsp;command until you get where you want to go. For example, open up your terminal and type&nbsp;<code class="markup--code markup--blockquote-code">ls</code>&nbsp;to list the choices of where you might go next. You might see “Desktop” listed, and you could just type&nbsp;<code class="markup--code markup--blockquote-code">cd Desktop</code>&nbsp;to get to your desktop. Then you can run the&nbsp;<code class="markup--code markup--blockquote-code">git clone</code>&nbsp;command above to clone your repository right onto your desktop.</p></blockquote>



<blockquote class="wp-block-quote graf graf--blockquote graf-after--blockquote is-layout-flow wp-block-quote-is-layout-flow"><p>You might see some user names instead of choices like “Desktop.” In that case, you need to choose a user before you see “Desktop,” so choose the user with&nbsp;<code class="markup--code markup--blockquote-code">cd &lt;user&gt;</code>&nbsp;(replacing&nbsp;<code class="markup--code markup--blockquote-code">&lt;user&gt;</code>&nbsp;with the user name) and then type&nbsp;<code class="markup--code markup--blockquote-code">ls</code>&nbsp;again to see your choices. There’s a very good chance you’ll see “Desktop” now. You’ll type&nbsp;<code class="markup--code markup--blockquote-code">cd Desktop</code>&nbsp;if you see the Desktop listed. Now go ahead with that git clone!</p></blockquote>



<blockquote class="wp-block-quote graf graf--blockquote graf-after--blockquote is-layout-flow wp-block-quote-is-layout-flow"><p>If you ever want to move back a step in your terminal, just type&nbsp;<code class="markup--code markup--blockquote-code">cd&nbsp;..</code></p></blockquote>
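<p class="wp-block-paragraph">The terminal navigation described above can be sketched like this. To keep it safe to run anywhere, it assumes a made-up home folder containing “Desktop” and “Documents” instead of using your real one, so the folder names are only examples.</p>

```shell
set -e  # stop at the first failing command

# Hypothetical folders so the sketch runs anywhere.
demo_home="$(mktemp -d)"
mkdir "$demo_home/Desktop" "$demo_home/Documents"

cd "$demo_home"
pwd          # print the folder you are in right now
ls           # list where you could go next: Desktop and Documents

cd Desktop   # step into Desktop
pwd          # the path now ends in /Desktop

cd ..        # step back out again
pwd          # back where you started
```

<p class="wp-block-paragraph">On your own computer you would skip the first two commands and start from wherever your terminal opens.</p>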



<p class="graf graf--p graf-after--blockquote wp-block-paragraph">Now you have a new GitHub repository that you can work with cloned right on your desktop!&nbsp;<span class="markup--quote markup--p-quote is-other">That command pulled in a complete copy of the repository right to your system where you can work on it, make changes, stage the changes, commit the changes, and then push the changes back to GitHub.</span></p>



<blockquote class="wp-block-quote graf graf--blockquote graf-after--p is-layout-flow wp-block-quote-is-layout-flow"><p>You don’t need to put the repository on your desktop if you don’t want to. You can clone it anywhere.&nbsp;<span class="markup--quote markup--blockquote-quote is-other">You can even run the&nbsp;<code class="markup--code markup--blockquote-code">git clone</code>&nbsp;command as soon as you open up your terminal.</span>&nbsp;I will say, though, that if you aren’t really comfortable navigating around your computer, it’s not a bad idea to have your project sitting right on your desktop where you can see it…</p></blockquote>



<p class="graf graf--p graf-after--blockquote wp-block-paragraph"><span class="markup--quote markup--p-quote is-other">If you ever want to just play with a project on your own, you can&nbsp;<strong class="markup--strong markup--p-strong">fork</strong>&nbsp;it on the GitHub website instead of cloning it.</span>&nbsp;Look up near the top right corner of the screen for the “fork” button and click it.&nbsp;<span class="markup--quote markup--p-quote is-other">This will make a copy of the repository in your repositories for you to play with on your own without doing anything to the original.</span></p>



<p class="wp-block-paragraph"></p>



<h3 class="graf graf--h4 graf-after--p wp-block-heading" id="c829">Now it’s time to add some files to your&nbsp;project!</h3>



<figure class="wp-block-image alignnone progressiveMedia-image js-progressiveMedia-image"><img decoding="async" src="https://cdn-images-1.medium.com/max/1600/0*s9srYHodB0cSLrPd" alt=""/><figcaption>Photo by&nbsp;<a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/@merrikh?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-creator noopener noreferrer">Nadim Merrikh</a>&nbsp;on&nbsp;<a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-source noopener noreferrer">Unsplash</a></figcaption></figure>



<p class="wp-block-paragraph"></p>



<p class="graf graf--p graf-after--figure wp-block-paragraph">This is all we’re about to do:</p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p">git status
git add
git commit -m " "
git push</pre>



<p class="graf graf--p graf-after--pre wp-block-paragraph">Nothing to worry about!</p>



<p class="graf graf--p graf-after--p wp-block-paragraph">I’m thinking you probably have some files that you want to put in your new repository. Go ahead and find your files and drag and drop them into the new folder for the repository that you created on your desktop, just like you normally would with any set of files you might want to move into a folder.</p>



<p class="graf graf--p graf-after--p wp-block-paragraph"><span class="markup--quote markup--p-quote is-other">Now, check out the&nbsp;<strong class="markup--strong markup--p-strong">status&nbsp;</strong>of your project!</span></p>



<p class="graf graf--p graf-after--p wp-block-paragraph">Go to your terminal and get yourself into the folder for your repository. Then run</p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p">git status</pre>



<p class="graf graf--p graf-after--pre wp-block-paragraph">to see if everything is up to date. (If you just dragged some files into your project folder, it definitely isn’t!) To add one of your files to the repository, you would run</p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p">git add &lt;filename&gt;</pre>



<p class="graf graf--p graf-after--pre wp-block-paragraph"><span class="markup--quote markup--p-quote is-other">Otherwise, you can add everything with</span></p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p">git add --all</pre>



<p class="graf graf--p graf-after--pre wp-block-paragraph"><span class="markup--quote markup--p-quote is-other">or even</span></p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p">git add .</pre>



<p class="graf graf--p graf-after--pre wp-block-paragraph">These are your proposed changes.&nbsp;<span class="markup--quote markup--p-quote is-other">You can do this exact same thing with brand new files and with files that are already in there but have some changes.</span>&nbsp;You aren’t actually adding anything just yet.&nbsp;<span class="markup--quote markup--p-quote is-other">You’re bringing new files and changes to Git’s attention.</span></p>
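<p class="graf graf--p graf-after--p wp-block-paragraph">Here’s a minimal sketch of what “bringing a file to Git’s attention” looks like. It assumes a brand-new practice repository in a temporary folder, and the file name “notes.txt” is made up for the example.</p>

```shell
# Throwaway repository in a temp folder; every name here is invented.
repo="$(mktemp -d)"
cd "$repo"
# Start a new repository and call its first branch "master", as in this guide.
git -c init.defaultBranch=master init -q
echo "hello" > notes.txt

git status --short   # "?? notes.txt": Git sees the file but isn't tracking it
git add notes.txt    # stage the file
git status --short   # "A  notes.txt": staged and waiting for a commit
```

<p class="graf graf--p graf-after--pre wp-block-paragraph">Nothing has been saved to the project’s history at this point. The file is only staged.</p>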



<p class="graf graf--p graf-after--p wp-block-paragraph">To commit the changes, you will start the process by running</p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p">git commit -m "&lt;commit message&gt;"</pre>



<p class="graf graf--p graf-after--pre wp-block-paragraph"><span class="markup--quote markup--p-quote is-other">You’re committing the changes to your local repository (the HEAD), but not to the remote repository.</span>&nbsp;(Make sure you replace that message in quotes with your own.)&nbsp;<span class="markup--quote markup--p-quote is-other">After you make a change, you take a “snapshot” of the repository with the “commit” command.</span>&nbsp;You’ll include a message on that “snapshot” with&nbsp;<code class="markup--code markup--p-code">-m</code>.</p>



<p class="graf graf--p graf-after--p wp-block-paragraph">When you save a change, that’s called a commit.&nbsp;<span class="markup--quote markup--p-quote is-other">When you make a commit, you’ll include a message about what you changed and/or why you changed it.</span>&nbsp;This is a great way to let others know what you’ve changed and why.</p>



<p class="graf graf--p graf-after--p wp-block-paragraph"><span class="markup--quote markup--p-quote is-other">Now your changes are in the head of your local working copy.</span>&nbsp;<span class="markup--quote markup--p-quote is-other">To send the changes to your remote repository, run</span></p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p">git push</pre>



<p class="graf graf--p graf-after--pre wp-block-paragraph"><span class="markup--quote markup--p-quote is-other">to push your changes right into your repository.</span>&nbsp;If you’re working on your local computer and&nbsp;<span class="markup--quote markup--p-quote is-other">you want your commits to be visible online too, you would push the changes up to GitHub with the&nbsp;<code class="markup--code markup--p-code">git push</code>&nbsp;command.</span></p>



<p class="graf graf--p graf-after--p wp-block-paragraph"><span class="markup--quote markup--p-quote is-other">You can see if everything is up to date any time by running the&nbsp;<code class="markup--code markup--p-code">git status</code>&nbsp;command!</span></p>
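<p class="graf graf--p graf-after--p wp-block-paragraph">If you want to rehearse the whole status, add, commit, push cycle without touching a real project, here’s one possible sketch. It’s an assumption-heavy practice setup: a local “bare” repository stands in for GitHub, and the folder names, file name, and “Demo User” identity are all made up.</p>

```shell
# Practice setup: a local bare repository plays the role of GitHub.
work="$(mktemp -d)"
git init -q --bare "$work/remote.git"
# Cloning it works just like cloning from GitHub.
# (Git may warn that the repository is empty; that's expected here.)
git clone -q "$work/remote.git" "$work/project"
cd "$work/project"
git config user.name "Demo User"           # made-up identity for the demo
git config user.email "demo@example.com"

echo "My first file" > first.txt
git status --short                  # "?? first.txt": new and untracked
git add first.txt                   # stage it
git commit -q -m "Add first file"   # save a snapshot locally
git push -q origin HEAD             # send the snapshot to the stand-in remote
git status --short                  # prints nothing: everything is committed
```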



<p class="graf graf--p graf-after--p wp-block-paragraph">So now you have a GitHub repository and you know how to add files and changes to it!</p>



<p class="graf graf--p graf-after--p wp-block-paragraph"><strong class="markup--strong markup--p-strong">Congratulations!!!</strong></p>



<p class="wp-block-paragraph"></p>



<h3 class="graf graf--h3 graf-after--p wp-block-heading" id="0432">Learning to work with&nbsp;others</h3>



<p class="graf graf--p graf-after--h3 wp-block-paragraph">Collaboration is the name of the game on GitHub!</p>



<figure class="wp-block-image alignnone progressiveMedia-image js-progressiveMedia-image"><img decoding="async" src="https://cdn-images-1.medium.com/max/1600/0*Qn_xsXkyf3lzpPrw" alt=""/><figcaption>Photo by&nbsp;<a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/@quinten149?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-creator noopener noreferrer">Quinten de Graaf</a>&nbsp;on&nbsp;<a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-source noopener noreferrer">Unsplash</a></figcaption></figure>



<h3 class="graf graf--h4 graf-after--figure wp-block-heading" id="e84f">GitHub flow</h3>



<p class="graf graf--p graf-after--h4 wp-block-paragraph">Let’s say you have a project going and you maybe have a lot of different ideas and features in mind at any given time. Some features might be ready to go, but some might not. Maybe you’re working with other people who are all kind of doing their own thing. This is where branching comes in!</p>



<p class="graf graf--p graf-after--p wp-block-paragraph"><span class="markup--quote markup--p-quote is-other">A branch is a separate space where you can try out new ideas.</span>&nbsp;If you change something on a branch,&nbsp;<span class="markup--quote markup--p-quote is-other">it doesn’t affect the master branch until you want it to.</span>&nbsp;This means that you can do whatever you want to do on that branch<span class="markup--quote markup--p-quote is-other">&nbsp;until you decide it’s time to merge it.</span></p>



<p class="graf graf--p graf-after--p wp-block-paragraph"><span class="markup--quote markup--p-quote is-other">In this workflow, the master branch is the one that holds the official version of your project.</span>&nbsp;If you don’t want your changes to deploy immediately,&nbsp;<span class="markup--quote markup--p-quote is-other">then make your changes on a separate branch and merge them into the master branch when you’re ready.</span></p>



<p class="graf graf--p graf-after--p wp-block-paragraph"><span class="markup--quote markup--p-quote is-other">If you’re working with others and want to make changes on your own, or if you’re working on your own and want to make changes without affecting the master branch, you want a separate branch.</span>&nbsp;You can create a new branch at any time.</p>



<p class="graf graf--p graf-after--p wp-block-paragraph">It’s also pretty simple to create a branch named “new_feature” in your terminal and switch to it with</p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p"><span class="markup--quote markup--pre-quote is-other">git checkout -b</span> new_feature</pre>



<p class="graf graf--p graf-after--pre wp-block-paragraph">Once you create a branch, you can make changes on that branch.&nbsp;<span class="markup--quote markup--p-quote is-other">This makes it easy to see what you’ve changed and why you’ve changed it.</span>&nbsp;Every time you commit your changes, you’ll add a message that you can use to describe what you’ve done.</p>



<p class="graf graf--p graf-after--p wp-block-paragraph">Let’s talk about checkout!</p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p">git checkout</pre>



<p class="graf graf--p graf-after--pre wp-block-paragraph"><span class="markup--quote markup--p-quote is-other">lets you switch to a branch that you’re not currently on.</span>&nbsp;You can check out the master branch with</p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p">git checkout master</pre>



<p class="graf graf--p graf-after--pre wp-block-paragraph">or look at the “new_feature” branch with</p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p">git checkout new_feature</pre>



<p class="graf graf--p graf-after--pre wp-block-paragraph"><span class="markup--quote markup--p-quote is-other">When you’re done with a branch, you can merge all of your changes back so that they’re visible to everyone.</span></p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p">git merge new_feature</pre>



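<p class="graf graf--p graf-after--pre wp-block-paragraph"><span class="markup--quote markup--p-quote is-other">will take all of the changes you made to the “new_feature” branch and add them to the branch you’re currently on.</span>&nbsp;To merge into master, switch to it first with&nbsp;<code class="markup--code markup--p-code">git checkout master</code>.</p>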
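<p class="graf graf--p graf-after--p wp-block-paragraph">To see branching and merging from start to finish, here’s a small sketch you can run in a throwaway folder. The file names and the “Demo User” identity are invented for the example. The branch name “new_feature” matches the one used in this guide.</p>

```shell
# Throwaway repository; every file name here is invented.
repo="$(mktemp -d)"
cd "$repo"
git -c init.defaultBranch=master init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "version 1" > app.txt
git add app.txt
git commit -q -m "Start the project"    # first commit, on master

git checkout -q -b new_feature          # create the branch and switch to it
echo "shiny new idea" > feature.txt
git add feature.txt
git commit -q -m "Try out a new idea"   # this commit lives only on new_feature

git checkout -q master                  # back on master: feature.txt disappears
git merge -q new_feature                # bring the branch's commits into master
ls                                      # app.txt and feature.txt are both here
```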



<p class="graf graf--p graf-after--p wp-block-paragraph"><span class="markup--quote markup--p-quote is-other">To push your new branch to GitHub for the first time and set the remote branch as its upstream (so that a plain&nbsp;<code class="markup--code markup--p-code">git push</code>&nbsp;or&nbsp;<code class="markup--code markup--p-code">git pull</code>&nbsp;knows where to go later), run</span></p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p"><span class="markup--quote markup--pre-quote is-other">git push --set-upstream origin new_feature</span></pre>
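<p class="graf graf--p graf-after--pre wp-block-paragraph">Here’s a hedged sketch of what setting an upstream actually does. It assumes a local bare repository standing in for GitHub, and the file name and “Demo User” identity are made up.</p>

```shell
# Practice setup: a local bare repository plays the role of GitHub.
work="$(mktemp -d)"
git init -q --bare "$work/remote.git"
git clone -q "$work/remote.git" "$work/project"   # may warn about an empty repo
cd "$work/project"
git config user.name "Demo User"
git config user.email "demo@example.com"

git checkout -q -b new_feature
echo "draft" > idea.txt
git add idea.txt
git commit -q -m "Sketch an idea"

# First push: create the branch on the remote and remember it as the upstream.
git push -q --set-upstream origin new_feature

# Ask Git which remote branch new_feature is linked to.
git rev-parse --abbrev-ref "new_feature@{upstream}"   # prints origin/new_feature
```

<p class="graf graf--p graf-after--pre wp-block-paragraph">Once the upstream is set, a plain&nbsp;<code class="markup--code markup--p-code">git push</code>&nbsp;from this branch knows where to send your commits.</p>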



<p class="graf graf--p graf-after--pre wp-block-paragraph"><span class="markup--quote markup--p-quote is-other">After you make some changes and decide you like them, you open a pull request.</span>&nbsp;If you’re on a team, this is when other people on your team can start checking out your changes and discussing them.&nbsp;<span class="markup--quote markup--p-quote is-other">You can open a pull request at any point, whether it’s to have people look over your final changes or ask for help because you’re stuck on something.</span></p>



<p class="wp-block-paragraph"></p>



<h3 class="graf graf--h4 graf-after--p wp-block-heading" id="e5a3">Ummmmm…what? Can I do that on the&nbsp;website?</h3>



<p class="graf graf--p graf-after--h4 wp-block-paragraph">You can!</p>



<p class="graf graf--p graf-after--figure wp-block-paragraph">One way to do this is simply by checking that button that we mentioned earlier when we were editing the README file. Super easy!</p>



<p class="graf graf--p graf-after--p wp-block-paragraph"><span class="markup--quote markup--p-quote is-other">You can also create a new branch any time right on the website by going to your repository, clicking the drop-down menu near the left-middle side of your screen that says “Branch: master,” typing a branch name, and selecting the “Create branch” link (or hitting enter on your keyboard).</span>&nbsp;Now you have two branches that look the same! This is a great place to make changes and test them out before you want to make them affect the master branch.</p>



<figure class="wp-block-image graf graf--figure graf-after--p"><img data-recalc-dims="1" decoding="async" src="https://learnbasictech.files.wordpress.com/2019/03/44905-1vzzzakxwzgcv6nbswmtmja.png?w=1080" alt=""/><figcaption>Creating a&nbsp;branch</figcaption></figure>



<p class="graf graf--p graf-after--figure wp-block-paragraph">If you’re working on a separate branch, your changes only affect that branch.</p>



<p class="graf graf--p graf-after--p wp-block-paragraph">If you’re happy with your changes and you want to merge your changes to the master branch, you can open a&nbsp;<strong class="markup--strong markup--p-strong">pull request</strong>. This is how, if you were on a team, you would propose your changes and ask someone to review them or pull in your contribution and merge them into their branch.</p>



<p class="graf graf--p graf-after--p wp-block-paragraph"><span class="markup--quote markup--p-quote is-other">You can open a pull request as soon as you make a commit, even if you haven’t finished your code.</span>&nbsp;You can do this right on the website if you’re more comfortable with that. If you’ve made some changes on your branch and you want to merge them, you can</p>



<ul class="postList wp-block-list"><li id="cbc2" class="graf graf--li graf-after--p">Click the pull request tab near the top center of the screen.</li><li id="63e5" class="graf graf--li graf-after--li">Click the green “New pull request” button.</li><li id="0067" class="graf graf--li graf-after--li"><span class="markup--quote markup--li-quote is-other">Go to the “Example Comparisons” box and select the branch you made to compare with the original branch.</span></li><li id="56d7" class="graf graf--li graf-after--li">Look over your changes to make sure they’re really what you want to commit.</li><li id="0274" class="graf graf--li graf-after--li">Then click the big green “Create pull request” button. Give it a title and write a brief description of your changes. Then click “Create Pull Request!”</li></ul>



<figure class="wp-block-image graf graf--figure graf-after--li"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/2b5ce-6b46f-1owhorw1cxw0sxp4m__sofa.png?w=1080" alt=""/><figcaption>New pull&nbsp;request</figcaption></figure>



<figure class="wp-block-image graf graf--figure graf-after--figure"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/daa0b-1edef-1bfsob2p0z2imajw_nuntna.png?w=1080" alt=""/><figcaption>Create pull&nbsp;request</figcaption></figure>



<p class="graf graf--p graf-after--figure wp-block-paragraph"><span class="markup--quote markup--p-quote is-other">Now if this is your repository, you can merge your pull request by clicking the green “Merge pull request” button to merge the changes into master.</span>&nbsp;Click “Confirm merge,”&nbsp;<span class="markup--quote markup--p-quote is-other">then delete the branch after your branch has been incorporated with the “Delete branch” button in the purple box.</span></p>



<p class="graf graf--p graf-after--p wp-block-paragraph">If you’re contributing to a project, people on the team (or the reviewer) might have questions or comments. If you need to change something, this is the time!&nbsp;<span class="markup--quote markup--p-quote is-other">If everything is good to go, they can deploy the changes right from the branch for final testing before you merge it.</span>&nbsp;And you can deploy your changes to verify them in production.</p>



<p class="graf graf--p graf-after--p wp-block-paragraph">If your changes have been verified, you can go ahead and merge your code into the master branch.&nbsp;<span class="markup--quote markup--p-quote is-other">The pull requests will preserve a record of your changes, which means that you can go through them any time to understand the changes and decisions that have been made.</span></p>



<p class="wp-block-paragraph"></p>



<h3 class="graf graf--h4 graf-after--p wp-block-heading" id="b266">Update and&nbsp;merge</h3>



<p class="graf graf--p graf-after--h4 wp-block-paragraph"><span class="markup--quote markup--p-quote is-other">If you’re working on your computer and want the most up-to-date version of a repository, you’d pull the changes down from GitHub with the&nbsp;<code class="markup--code markup--p-code">git pull</code>&nbsp;command.</span>&nbsp;To update your local repository to the newest commit, run</p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p">git pull</pre>



<p class="graf graf--p graf-after--pre wp-block-paragraph">in your working directory.</p>
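<p class="graf graf--p graf-after--p wp-block-paragraph">To watch a pull happen, you need someone else’s changes to pull. This sketch fakes that on one computer: a local bare repository stands in for GitHub, and two clones named “you” and “teammate” (both made up, along with the file name and identity) take turns.</p>

```shell
# Seed a small project, then publish it to a stand-in for GitHub.
work="$(mktemp -d)"
git -c init.defaultBranch=master init -q "$work/seed"
cd "$work/seed"
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "first line" > notes.txt
git add notes.txt
git commit -q -m "Start notes"
git clone -q --bare "$work/seed" "$work/remote.git"

# Two separate copies of the same repository.
git clone -q "$work/remote.git" "$work/you"
git clone -q "$work/remote.git" "$work/teammate"

# The teammate makes a change and pushes it.
cd "$work/teammate"
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "second line" >> notes.txt
git commit -q -a -m "Add a second line"
git push -q

# Your copy is now behind, so pull to catch up.
cd "$work/you"
git pull -q
cat notes.txt      # shows both lines
```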



<p class="graf graf--p graf-after--p wp-block-paragraph"><span class="markup--quote markup--p-quote is-other">To merge another branch into your active branch, use</span></p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p">git merge &lt;branch_name&gt;</pre>



<p class="graf graf--p graf-after--pre wp-block-paragraph"><span class="markup--quote markup--p-quote is-other">Git will try to auto-merge changes, but this isn’t always possible.</span>&nbsp;Conflicts might arise when two branches change the same lines. If they do,&nbsp;<span class="markup--quote markup--p-quote is-other">you’ll need to resolve the conflicts manually by editing the affected files.</span>&nbsp;After editing them,&nbsp;<span class="markup--quote markup--p-quote is-other">you can mark them as merged with&nbsp;<code class="markup--code markup--p-code">git add &lt;filename&gt;</code>.</span>&nbsp;You can preview the differences between two branches before you merge them with</p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p">git diff &lt;source_branch&gt; &lt;target_branch&gt;</pre>
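<p class="graf graf--p graf-after--pre wp-block-paragraph">Conflicts sound scarier than they are. Here’s a rough sketch that creates one on purpose and then fixes it, all in a throwaway folder. The file name, messages, and “Demo User” identity are invented, and overwriting the file with&nbsp;<code class="markup--code markup--p-code">echo</code>&nbsp;is just a stand-in for editing it by hand in your text editor.</p>

```shell
repo="$(mktemp -d)"
cd "$repo"
git -c init.defaultBranch=master init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "Hello" > greeting.txt
git add greeting.txt
git commit -q -m "Add greeting"

# Change the same line on two different branches.
git checkout -q -b new_feature
echo "Hello from the branch" > greeting.txt
git commit -q -a -m "Change greeting on the branch"

git checkout -q master
echo "Hello from master" > greeting.txt
git commit -q -a -m "Change greeting on master"

git diff master new_feature    # preview how the two branches differ
git merge new_feature || echo "Conflict! Time to fix greeting.txt by hand."

echo "Hello from both" > greeting.txt   # stand-in for editing the file yourself
git add greeting.txt                    # mark the conflict as resolved
git commit -q -m "Merge new_feature"    # finish the merge
```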



<p class="graf graf--p graf-after--pre wp-block-paragraph">You can switch back to the master branch with</p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p">git checkout master</pre>



<p class="graf graf--p graf-after--pre wp-block-paragraph"><span class="markup--quote markup--p-quote is-other">Once your changes have been merged, you can delete the branch with</span></p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p">git branch -d new_feature</pre>



<p class="graf graf--p graf-after--pre wp-block-paragraph"><span class="markup--quote markup--p-quote is-other">This branch isn’t available to anyone else unless you push the branch to your remote repository with</span></p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p">git push origin &lt;branch&gt;
</pre>



<h3 class="graf graf--h4 graf-after--pre wp-block-heading" id="534e">Other helpful&nbsp;commands</h3>



<p class="graf graf--p graf-after--h4 wp-block-paragraph">First of all, this is my favorite&nbsp;<a class="markup--anchor markup--p-anchor" href="https://education.github.com/git-cheat-sheet-education.pdf" target="_blank" rel="noopener noreferrer">GitHub cheatsheet</a>. Check it out for all of the most useful Git commands!</p>



<p class="graf graf--p graf-after--p wp-block-paragraph">You can see the commit history of the repository if you run</p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p">git log</pre>



<p class="graf graf--p graf-after--pre wp-block-paragraph">You can see one person’s commits with</p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p">git log --author=&lt;name&gt;</pre>



<p class="graf graf--p graf-after--pre wp-block-paragraph"><span class="markup--quote markup--p-quote is-other">You can see what has been changed but not staged yet with</span></p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p">git diff</pre>
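<p class="graf graf--p graf-after--pre wp-block-paragraph">Here’s a quick sketch of these commands in action. It assumes a throwaway repository with two made-up authors, “Ada” and “Grace,” so there’s something to filter by.</p>

```shell
repo="$(mktemp -d)"
cd "$repo"
git -c init.defaultBranch=master init -q

# Two commits by two invented authors.
echo "one" > file.txt
git add file.txt
git -c user.name="Ada" -c user.email="ada@example.com" \
    commit -q -m "Ada adds the file"
echo "two" >> file.txt
git -c user.name="Grace" -c user.email="grace@example.com" \
    commit -q -a -m "Grace updates the file"

git log --oneline                # both commits, newest first
git log --oneline --author=Ada   # only Ada's commit

echo "three" >> file.txt         # change the file without staging it
git diff                         # shows the new "three" line as an addition
```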



<p class="graf graf--p graf-after--pre wp-block-paragraph">Need help remembering what command you’re supposed to run? Try</p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p">git help</pre>



<p class="graf graf--p graf-after--pre wp-block-paragraph">to see a list of the most common commands. You can also type something like</p>



<pre class="wp-block-preformatted graf graf--pre graf-after--p">git help clone</pre>



<p class="graf graf--p graf-after--pre wp-block-paragraph">to figure out how to use a specific command like “clone.”</p>



<p class="wp-block-paragraph"></p>



<h3 class="graf graf--h4 graf-after--p wp-block-heading" id="985e">Let’s do&nbsp;this!</h3>



<figure class="wp-block-image alignnone progressiveMedia-image js-progressiveMedia-image"><img decoding="async" src="https://cdn-images-1.medium.com/max/1600/0*PtLPaJPif39ceAtW" alt=""/><figcaption>Photo by&nbsp;<a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/@mervynckw?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-creator noopener noreferrer">Mervyn Chan</a>&nbsp;on&nbsp;<a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-source noopener noreferrer">Unsplash</a></figcaption></figure>



<p class="graf graf--p graf-after--figure wp-block-paragraph">Why not leave your mark and welcome everyone who’s here to learn about Git and GitHub?&nbsp;<span class="markup--quote markup--p-quote is-other">We’re going to create a simple welcome wall with notes from everyone who wants to try out Git and GitHub and contribute to their first open-source project.</span></p>



<p class="graf graf--p graf-after--p wp-block-paragraph">You can add whatever you want to the welcome wall, as long as you keep it warm and encouraging. Add a note, add an image, whatever. Make our little world better in whatever way makes you happy. (If you’re an overthinker (I see you ❤️), I have a pre-written message in the README file that you can just copy and paste.)</p>



<ul class="postList wp-block-list"><li id="e7fc" class="graf graf--li graf-after--p"><a class="markup--anchor markup--li-anchor" href="https://github.com/bonn0062/github_welcome_wall" target="_blank" rel="noopener noreferrer">Clone the repository</a>, either on the GitHub website or by running</li></ul>



<pre class="wp-block-preformatted graf graf--pre graf-after--li">git clone <a class="markup--anchor markup--pre-anchor" href="https://github.com/bonn0062/github_welcome_wall.git" target="_blank" rel="nofollow noopener noreferrer">https://github.com/bonn0062/github_welcome_wall.git</a></pre>



<ul class="postList wp-block-list"><li id="c63a" class="graf graf--li graf-after--pre">Create a new branch and add a welcoming and encouraging thought to the “welcome_wall.md” file. You can do this on the website, but I really encourage you to try cloning the repository to your computer, opening the file with your favorite text editor, and adding your message there. It’s just good learning!</li></ul>



<ul class="postList wp-block-list"><li id="1a8d" class="graf graf--li graf-after--li"><a class="markup--anchor markup--li-anchor" href="https://github.com/bonn0062/github_welcome_wall.git" target="_blank" rel="noopener noreferrer">Create a pull request</a>.</li><li id="0885" class="graf graf--li graf-after--li">Write a quick note describing your change and click the green button to create your pull request.</li></ul>



<p class="graf graf--p graf-after--li wp-block-paragraph">That’s it! If it’s a decent message, thought, image, or idea, I’ll merge your request and you will have successfully contributed to an open-source project.</p>



<p class="graf graf--p graf-after--p wp-block-paragraph"><strong>Congratulations!!! You did it!</strong></p>



<p class="graf graf--p graf-after--figure wp-block-paragraph">As always, if you do anything awesome with this information, I’d love to hear about it! Leave a message in the responses section or reach out any time on Twitter&nbsp;<a class="markup--anchor markup--p-anchor" href="https://twitter.com/annebonnerdata" target="_blank" rel="noopener noreferrer">@annebonnerdata</a>.</p>



<p class="graf graf--p graf-after--p wp-block-paragraph">If you liked this article, you might want to check out:</p>



<ul class="postList wp-block-list"><li><a rel="noreferrer noopener" aria-label="Getting Started with Google Colab: a simple tutorial for the frustrated and confused (opens in a new tab)" href="https://towardsdatascience.com/getting-started-with-google-colab-f2fff97f594c" target="_blank">Getting Started with Google Colab: a simple tutorial for the frustrated and confused</a></li><li><a href="https://contentsimplicity.com/how-to-create-a-free-portfolio/">How to Create a Totally Free Website, Portfolio, or Blog</a></li><li><a class="markup--anchor markup--li-anchor" rel="noopener noreferrer" href="https://towardsdatascience.com/the-complete-beginners-guide-to-data-cleaning-and-preprocessing-2070b7d4c6d" target="_blank">The complete beginner’s guide to data cleaning and preprocessing</a></li></ul>



<p class="graf graf--p graf-after--li graf--trailing wp-block-paragraph">Thanks for reading! ❤️</p>
<p>The post <a href="https://contentsimplicity.com/getting-started-with-git-and-github-the-complete-beginners-guide/">Getting started with Git and GitHub: the complete beginner’s guide</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://contentsimplicity.com/getting-started-with-git-and-github-the-complete-beginners-guide/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">230</post-id>	</item>
		<item>
		<title>The brilliant beginner’s guide to model deployment</title>
		<link>https://contentsimplicity.com/the-brilliant-beginners-guide-to-model-deployment/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=the-brilliant-beginners-guide-to-model-deployment&#038;utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=the-brilliant-beginners-guide-to-model-deployment</link>
					<comments>https://contentsimplicity.com/the-brilliant-beginners-guide-to-model-deployment/#respond</comments>
		
		<dc:creator><![CDATA[Anne B]]></dc:creator>
		<pubDate>Fri, 04 Sep 2020 04:37:49 +0000</pubDate>
				<category><![CDATA[tutorial]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[artificial intelligence]]></category>
		<category><![CDATA[Coding]]></category>
		<category><![CDATA[deep learning]]></category>
		<category><![CDATA[flask]]></category>
		<category><![CDATA[machine learning]]></category>
		<category><![CDATA[model deployment]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[pytorch]]></category>
		<guid isPermaLink="false">https://learnbasictech.wordpress.com/?p=146</guid>

					<description><![CDATA[<p>The brilliant beginner’s guide to model deployment: a clear and simple roadmap for getting your machine learning model on the Internet and doing something cool</p>
<p>The post <a href="https://contentsimplicity.com/the-brilliant-beginners-guide-to-model-deployment/">The brilliant beginner’s guide to model deployment</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></description>
										<content:encoded><![CDATA[<section class="section section--body section--first">
<div class="section-content">
<div class="section-inner sectionLayout--fullWidth">
<figure id="aaef" class="graf graf--figure graf--layoutFillWidth graf-after--h4">
<div><em><a href="https://heartbeat.fritz.ai/brilliant-beginners-guide-to-model-deployment-133e158f6717" target="_blank" rel="noopener noreferrer">This article first appeared in Heartbeat by Fritz</a></em></div>
<div></div>
<div class="aspectRatioPlaceholder is-locked">
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/0ddce-f4e16-1cqcerml1b66djze-rjhi-a.jpeg?w=1080" /></div>
</div>
</figure>
</div>
<div class="section-inner sectionLayout--insetColumn">
<p id="e3a1" class="graf graf--p graf-after--figure">You built this amazing machine learning model—<a class="markup--anchor markup--p-anchor" href="https://medium.freecodecamp.org/how-to-build-the-best-image-classifier-3c72010b3d55" target="_blank" rel="noopener noreferrer">this one</a>, let’s say—but now what?</p>
<p id="891c" class="graf graf--p graf-after--p">How do you take your model and turn it into something that you can display on the web? How do you turn it into something that other people can interact with? How do you make it useful?</p>
<p id="a1cc" class="graf graf--p graf-after--p">You deploy it!</p>
<figure id="e419" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"></div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://cdn-images-1.medium.com/max/1600/0*6yIiOkpM9pULB9i0" /></div>
</div><figcaption class="imageCaption">Photo by <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/@brazofuerte?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-creator noopener noreferrer">Collin Armstrong</a> on <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-source noopener noreferrer">Unsplash</a></figcaption></figure>
<p id="60cf" class="graf graf--p graf-after--figure">Having the knowledge and ability to deploy your machine learning model is an absolute necessity. Whether you’re building a model or generating reports, you need this skill. It takes that model that you poured your blood, sweat, and tears into and turns it into something that absolutely anyone can play with and admire.</p>
<p id="bc02" class="graf graf--p graf-after--p">This article will walk you through the basics of deploying a machine learning model. We&#8217;re going to <span class="markup--quote markup--p-quote is-other">deploy a <a class="markup--anchor markup--p-anchor" href="https://heartbeat.fritz.ai/introduction-to-pytorch-for-deep-learning-5b437cea90ac" target="_blank" rel="noopener noreferrer">PyTorch</a> <a class="markup--anchor markup--p-anchor" href="https://heartbeat.fritz.ai/basics-of-image-classification-with-pytorch-2f8973c51864" target="_blank" rel="noopener noreferrer">image classifier</a> with Flask</span>. This is the first critical step towards turning your model into an app.</p>
<p id="ea5e" class="graf graf--p graf-after--p">By the end of this article, you’ll be able to take a PyTorch image classifier and turn it into a cool web app. In this app, users will be able to upload an image of a flower to see what kind of flower it is.</p>
<p id="e843" class="graf graf--p graf-after--p">Your deep learning image classifier will now be an awesome image prediction app.</p>
<figure id="69c1" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"></div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/63e59-727c1-1yl8sflizugdqqmvcgbtaoa.png?w=1080" /></div>
</div>
</figure>
<h4 id="b100" class="graf graf--h4 graf-after--figure">Let’s get started!</h4>
<p id="024a" class="graf graf--p graf-after--h4">First, let’s cover the basics of getting Flask installed. (You can find <a class="markup--anchor markup--p-anchor" href="http://flask.pocoo.org/docs/1.0/installation/#installation" target="_blank" rel="noopener noreferrer">the official installation guide</a> here if you want to take a look!)</p>
<p id="9b50" class="graf graf--p graf-after--p">It’s a good idea to set up a virtual environment to manage the dependencies of your project. You can do that by setting up a folder for your project, then going to your terminal and running:</p>
<pre id="5875" class="graf graf--pre graf-after--p">mkdir myproject</pre>
<pre id="7757" class="graf graf--pre graf-after--pre">cd myproject</pre>
<pre id="dde1" class="graf graf--pre graf-after--pre">python3 -m venv venv</pre>
<blockquote id="61ec" class="graf graf--blockquote graf-after--pre"><p>I should let you know now that everything that I’m going to do here works on a Mac with Python 3. If you’re working on Windows or running Python 2, you might want to head on over to the official documentation to see what you might need to tweak to get up and running.</p></blockquote>
<p id="dae8" class="graf graf--p graf-after--blockquote">Next, activate your environment.</p>
<pre id="6205" class="graf graf--pre graf-after--p">. venv/bin/activate</pre>
<p id="25db" class="graf graf--p graf-after--pre">Now we can install Flask.</p>
<pre id="9a91" class="graf graf--pre graf-after--p">pip install Flask</pre>
<p id="48da" class="graf graf--p graf-after--pre">You’re ready to go!</p>
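<p class="graf graf--p">If you’d like to double-check the setup before moving on, a small Python sketch along these lines can help. It uses only the standard library, and the helper names <code class="markup--code markup--p-code">in_virtualenv</code> and <code class="markup--code markup--p-code">is_installed</code> are made up for this example.</p>

```python
import importlib.util
import sys


def in_virtualenv():
    """Return True when Python is running inside a venv-style virtual environment."""
    # Inside a venv, sys.prefix points at the environment,
    # while sys.base_prefix still points at the base interpreter.
    return sys.prefix != sys.base_prefix


def is_installed(package_name):
    """Return True when a top-level package can be found on the import path."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("Flask installed:", is_installed("flask"))
```

<p class="graf graf--p">Save it under any name you like (other than flask.py) and run it with <code class="markup--code markup--p-code">python</code> from the same terminal where you activated the environment.</p>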
<p id="3d00" class="graf graf--p graf-after--figure">The <a class="markup--anchor markup--p-anchor" href="http://flask.pocoo.org/docs/1.0/quickstart/#rendering-templates" target="_blank" rel="noopener noreferrer">quickstart guide</a> is a really helpful document to check out if you’re interested in learning a bit more about the basics. I’m going to start you out with a little information that’s very similar to the information provided in that guide. There isn’t a better or clearer explanation of the basics of Flask than that one.</p>
<p id="f8dd" class="graf graf--p graf-after--p">To <span class="markup--quote markup--p-quote is-other">create a seriously minimal Flask application, you start by creating a file.</span></p>
<p id="453d" class="graf graf--p graf-after--p">Create the file and open it in your favorite text editor. Then type</p>
<pre id="e3ad" class="graf graf--pre graf-after--p"><strong class="markup--strong markup--pre-strong">from</strong> flask <strong class="markup--strong markup--pre-strong">import</strong> Flask
app = Flask<strong class="markup--strong markup--pre-strong">(</strong>__name__<strong class="markup--strong markup--pre-strong">)</strong>

@app.route<strong class="markup--strong markup--pre-strong">(</strong>'/'<strong class="markup--strong markup--pre-strong">)</strong>
<strong class="markup--strong markup--pre-strong">def</strong> hello_world<strong class="markup--strong markup--pre-strong">():</strong>
    <strong class="markup--strong markup--pre-strong">return</strong> 'Hello, there!'</pre>
<p id="6ed6" class="graf graf--p graf-after--pre">What does the code above do?</p>
<p id="14ce" class="graf graf--p graf-after--p">First of all, we imported the Flask class. <span class="markup--quote markup--p-quote is-other">Next, we created an instance of the class.</span> The first argument is the name of the application’s module. If you’re using a single module, you’ll use <code class="markup--code markup--p-code">__name__</code> so that Flask knows where to look for resources like templates and static files. The <code class="markup--code markup--p-code">route()</code> decorator tells Flask what URL is supposed to trigger our function. The function’s name is also used to generate URLs for that function, and the function returns the message we want to display in the user’s browser.</p>
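<p class="graf graf--p">If the decorator syntax looks like magic, this toy sketch may help. It is not Flask’s implementation, just a plain-Python stand-in (the <code class="markup--code markup--p-code">TinyApp</code> class and its <code class="markup--code markup--p-code">handle</code> method are invented for illustration) that shows the idea behind the route line: the decorator stores your function under a URL, and the app looks the function up when a request for that URL arrives.</p>

```python
class TinyApp:
    """A toy stand-in for Flask that only maps URLs to functions."""

    def __init__(self):
        self.routes = {}  # URL path -> view function

    def route(self, path):
        # route() returns a decorator that registers the function under `path`.
        def decorator(view_function):
            self.routes[path] = view_function
            return view_function
        return decorator

    def handle(self, path):
        # Look up the function for this URL and return whatever it returns.
        view_function = self.routes.get(path)
        if view_function is None:
            return "404 Not Found"
        return view_function()


app = TinyApp()


@app.route('/')
def hello_world():
    return 'Hello, there!'


print(app.handle('/'))
print(app.handle('/missing'))
```

<p class="graf graf--p">The real Flask class does far more (request parsing, URL converters, error handling), but the registration step works on the same principle.</p>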
<p id="49c1" class="graf graf--p graf-after--p">You can save the file as hello.py or whatever.py or anything else that makes you happy. Just <strong class="markup--strong markup--p-strong">don’t save it as flask.py</strong> because that name will conflict with Flask itself. I like to go with app.py for the main Flask file because that’s the name Flask will look for later.</p>
<p id="0b66" class="graf graf--p graf-after--p">If you want to run it, go to your terminal and type</p>
<pre id="6493" class="graf graf--pre graf-after--p">export FLASK_APP=app.py</pre>
<p id="7ca3" class="graf graf--p graf-after--pre">and then</p>
<pre id="8380" class="graf graf--pre graf-after--p">flask run</pre>
<p id="b9dd" class="graf graf--p graf-after--pre">If everything’s working, you’ll see something like this</p>
<pre id="dc4b" class="graf graf--pre graf-after--p">Running on http://127.0.0.1:5000/</pre>
<p id="40c9" class="graf graf--p graf-after--pre">Now you can click (command-click) on that web address or copy and paste it into your browser. See if it works!</p>
<figure id="067b" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"></div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/10cdd-f95eb-1eqlhf-iwfqrb6miid1sthq.png?w=1080" /></div>
</div>
</figure>
<p id="b9e5" class="graf graf--p graf-after--figure">(Any time you want to shut it down, just type control-C in your terminal window.)</p>
<p id="ca24" class="graf graf--p graf-after--p">Now, here’s the thing I really like to run when I’m trying to create something in Flask:</p>
<pre id="1d6f" class="graf graf--pre graf-after--p">export FLASK_ENV=development</pre>
<p id="56db" class="graf graf--p graf-after--pre">I run that command before I run <code class="markup--code markup--p-code">flask run</code>. This puts you in development mode. <span class="markup--quote markup--p-quote is-other">That means that instead of having to do a manual restart every single time you make a change to your code, your server will reload itself when you change your code.</span> It will also provide you with a seriously helpful debugger when things go wrong!</p>
<figure id="43d0" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"></div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/3bf10-fb6c8-1raoi5xx4bg2o7ub6sb9shw.jpeg?w=1080" /></div>
</div><figcaption class="imageCaption">Image by Miryams-Fotos on <a class="markup--anchor markup--figure-anchor" href="http://pixabay.com/" target="_blank" rel="noopener noreferrer">Pixabay</a></figcaption></figure>
<p id="b50c" class="graf graf--p graf-after--figure">That being said, <strong class="markup--strong markup--p-strong">putting Flask into development mode presents a major security risk</strong>, because the interactive debugger lets anyone who can reach it run arbitrary Python code on your machine. You never, ever, ever want to use it on production machines.</p>
<p id="c6de" class="graf graf--p graf-after--p">The <a class="markup--anchor markup--p-anchor" href="http://flask.pocoo.org/docs/1.0/quickstart/#rendering-templates" target="_blank" rel="noopener noreferrer">quickstart guide</a> also tells you how to bind functions to meaningful URLs, which makes it easier for people to come back to your web app. It covers how to create unique URLs, how to render templates, and more! It also walks you through how to read and store cookies, how to upload files, and how to set up redirects and errors. Check it out if you’re looking for more of the basics.</p>
<figure id="86a9" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"></div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/c41bb-4c9e8-1wkuupeyn6rjfkgpfajnfqa.jpeg?w=1080" /></div>
</div><figcaption class="imageCaption">Image by DariuszSanowski on <a class="markup--anchor markup--figure-anchor" href="http://pixabay.com/" target="_blank" rel="noopener noreferrer">Pixabay</a></figcaption></figure>
<h4 id="26c2" class="graf graf--h4 graf-after--figure">On to our project!</h4>
<p id="17ea" class="graf graf--p graf-after--h4">You’ll want to begin by installing the packages you’ll be importing, so go to your terminal and run the command below. (PIL is installed through the Pillow package.)</p>
<pre id="3883" class="graf graf--pre graf-after--p">pip install flask torch gunicorn Pillow</pre>
<p id="9b20" class="graf graf--p graf-after--pre">We’ll work inside a folder for this project. (If you didn’t create the folder and file earlier, do that now.) Navigate to your folder in the terminal and run the commands below one line at a time. If you want to make sure your new web app is working, copy the app.py code from the example above into your app.py file before you run it.</p>
<pre id="4ee3" class="graf graf--pre graf-after--p">sublime app.py
python app.py
flask run</pre>
<p id="90bc" class="graf graf--p graf-after--pre">(The command “sublime app.py” above will only work if you want to work in <a class="markup--anchor markup--p-anchor" href="https://www.sublimetext.com/" target="_blank" rel="noopener noreferrer">Sublime</a> and have the shortcut set up. If you prefer another text editor, you can skip that command and create a new file called “app.py” in your editor instead.)</p>
<p id="feca" class="graf graf--p graf-after--p">You can command-click on the link that shows up, just like we did earlier, or copy and paste it into your browser.</p>
<p id="dcbe" class="graf graf--p graf-after--p">You don’t want to just throw everything up in a string, so create a folder called “templates.” In “templates,” create one file called “index.html” and one file called “result.html.”</p>
<p id="4523" class="graf graf--p graf-after--p">Open up index.html in your text editor and set up an HTML template. If you’re using Sublime, you can type html &lt;TAB&gt; to create a basic HTML template.</p>
<p id="eb14" class="graf graf--p graf-after--p">Put the name of your project in the title and add “hello, there” between &lt;p&gt; and &lt;/p&gt; in the body section.</p>
<figure id="49bf" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"></div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/f637e-7761e-1ief_6fmviryutmsr7slmya.png?w=1080" /></div>
</div>
</figure>
<p id="f1f0" class="graf graf--p graf-after--figure">Go back to your app.py file, add <code class="markup--code markup--p-code">render_template</code> (and <code class="markup--code markup--p-code">request</code>, which we’ll need shortly) to the import on the first line, and replace “Hello, there!” with <code class="markup--code markup--p-code">render_template('index.html')</code>.</p>
<pre id="54de" class="graf graf--pre graf-after--p">from flask import Flask, request, render_template
app = Flask(__name__)</pre>
<pre id="d7ba" class="graf graf--pre graf-after--pre">@app.route('/')
def hello_world():
    return render_template('index.html')</pre>
<p id="e974" class="graf graf--p graf-after--pre">Flask will take a look at app.py, then reach into your templates folder and pull up index.html, which we have set to display “Hello, there!” If you reload your page, you’ll see this</p>
<figure id="87cb" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"></div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/10cdd-f95eb-1eqlhf-iwfqrb6miid1sthq.png?w=1080" /></div>
</div>
</figure>
<p id="8671" class="graf graf--p graf-after--figure">You can tell Flask to reload whenever you save your changes by adding</p>
<pre id="5ff4" class="graf graf--pre graf-after--p">if __name__ == '__main__':
    app.run(debug=True)</pre>
<p id="0975" class="graf graf--p graf-after--pre">to the bottom of your app.py file. This block only runs when you start the app with <code class="markup--code markup--p-code">python app.py</code>, which then puts it in debug mode.</p>
<p id="a0ed" class="graf graf--p graf-after--p">You can easily pass in values by changing your <strong class="markup--strong markup--p-strong">app.py file</strong> to something like this</p>
<pre id="408e" class="graf graf--pre graf-after--p">from flask import Flask, request, render_template
app = Flask(__name__)</pre>
<pre id="38f8" class="graf graf--pre graf-after--pre">@app.route('/')
def hello_world():
    return render_template('index.html', value='hi')</pre>
<pre id="d1f1" class="graf graf--pre graf-after--pre">if __name__ == '__main__':
    app.run(debug=True)</pre>
<p id="0366" class="graf graf--p graf-after--pre">and your <strong class="markup--strong markup--p-strong">index.html</strong> file to this</p>
<pre id="bcc3" class="graf graf--pre graf-after--p">&lt;!DOCTYPE html&gt;
&lt;html&gt;
&lt;head&gt;
 &lt;title&gt;Flower Classifier&lt;/title&gt;
&lt;/head&gt;
&lt;body&gt;
 &lt;p&gt;Hi there! {{ value }}&lt;/p&gt;
&lt;/body&gt;
&lt;/html&gt;</pre>
<p id="b7dc" class="graf graf--p graf-after--pre">We’re up and running! However, if we want to build a web app that will allow users to upload a file or an image and display the results, we’ll want to build an app that accepts both “GET” and “POST” methods.</p>
<p id="56f5" class="graf graf--p graf-after--p">To do that, we change our <strong class="markup--strong markup--p-strong">app.py</strong> file to</p>
<pre id="60fd" class="graf graf--pre graf-after--p">from flask import Flask, request, render_template
app = Flask(__name__)</pre>
<pre id="90dc" class="graf graf--pre graf-after--pre">@app.route('/', methods=['GET', 'POST'])
def hello_world():
    if request.method == 'GET':
        return render_template('index.html', value='hi')
    if request.method == 'POST':
        return render_template('result.html')</pre>
<pre id="9de6" class="graf graf--pre graf-after--pre">if __name__ == '__main__':
    app.run(debug=True)</pre>
<p id="a4a7" class="graf graf--p graf-after--pre">and we’ll change our <strong class="markup--strong markup--p-strong">index.html</strong> file to</p>
<pre id="739d" class="graf graf--pre graf-after--p">&lt;!DOCTYPE html&gt;
&lt;html&gt;
&lt;head&gt;
 &lt;title&gt;Flower App&lt;/title&gt;
&lt;/head&gt;
&lt;body&gt;
 &lt;h2&gt;Upload your flower image&lt;/h2&gt;
 &lt;form method="post" enctype="multipart/form-data"&gt;
  &lt;input type="file" name="file"&gt;
  &lt;input type="submit" value="upload"&gt;
 &lt;/form&gt;
&lt;/body&gt;
&lt;/html&gt;</pre>
<figure id="41b9" class="graf graf--figure graf-after--pre">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"></div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/85da7-96e89-1tpcz7olcvexohvvaqvjgtq.png?w=1080" /></div>
</div>
</figure>
<p id="f39c" class="graf graf--p graf-after--figure">Remember that <strong class="markup--strong markup--p-strong">result.html</strong> file we created? Now you want to have something in there, so open up that file and add</p>
<pre id="86c0" class="graf graf--pre graf-after--p">&lt;!DOCTYPE html&gt;
&lt;html&gt;
&lt;head&gt;
 &lt;title&gt;Flower App&lt;/title&gt;
&lt;/head&gt;
&lt;body&gt;
 &lt;h2&gt;Prediction&lt;/h2&gt;
 &lt;p&gt;Flower Name: Lily&lt;/p&gt;
&lt;/body&gt;
&lt;/html&gt;</pre>
<p id="75fc" class="graf graf--p graf-after--pre">Now you should be able to reload your browser window (restart the server first if you’re not running in debug mode), upload an image, and see this as your result</p>
<figure id="c3b8" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"></div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/54b09-d67ee-1dvjlsgbeyredajp1ldhokg.png?w=1080" /></div>
</div>
</figure>
<p id="c3bc" class="graf graf--p graf-after--figure">Right now, our results page will just say that your image is a lily no matter what you upload.</p>
<p id="8911" class="graf graf--p graf-after--p"><strong class="markup--strong markup--p-strong">Congratulations if you’ve gotten this far!</strong></p>
<p id="6363" class="graf graf--p graf-after--figure">You’re going to want to be able to render the results, and doing that is incredibly simple. If you want to test it out, change your <strong class="markup--strong markup--p-strong">app.py</strong> file so that you render your results this way: <code class="markup--code markup--p-code">return render_template('result.html', flower=flower_name)</code> (you’re just adding the second argument, and for a quick test you can set <code class="markup--code markup--p-code">flower_name</code> to any string on the line above). Next, change the “Flower Name” line in your <strong class="markup--strong markup--p-strong">result.html</strong> file to read <code class="markup--code markup--p-code">&lt;p&gt;Flower Name: {{ flower }}&lt;/p&gt;</code>.</p>
<p id="5927" class="graf graf--p graf-after--p">Now you’re going to create some inference! I’m going to assume that you have <a class="markup--anchor markup--p-anchor" href="https://medium.freecodecamp.org/how-to-build-the-best-image-classifier-3c72010b3d55" target="_blank" rel="noopener noreferrer">an image classifier created using PyTorch</a> with a saved checkpoint. You’ll need that to actually make this work! Put that checkpoint in your project folder.</p>
<p id="de3c" class="graf graf--p graf-after--p">If you don’t have a checkpoint file, <a class="markup--anchor markup--p-anchor" href="https://medium.freecodecamp.org/how-to-build-the-best-image-classifier-3c72010b3d55" target="_blank" rel="noopener noreferrer">check out this article on creating a seriously accurate image classifier in PyTorch</a>. It gives you all of the code you need to create an image classifier and create that checkpoint.</p>
<p id="9c6b" class="graf graf--p graf-after--p">Now we need to write a way to grab the image and send the info to the template. First, you’ll need a function to get the model and create your prediction. Create a <strong class="markup--strong markup--p-strong">commons.py</strong> file and write a function to get the model as well as something that will allow you to convert the uploaded file into a tensor. Try this!</p>
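<p class="graf graf--p">A minimal sketch of what such a <strong class="markup--strong markup--p-strong">commons.py</strong> could look like follows, under a few loud assumptions: your checkpoint holds a plain <code class="markup--code markup--p-code">state_dict</code>, your model expects 224×224 images normalized with the usual ImageNet statistics, and you supply a <code class="markup--code markup--p-code">build_model</code> function (a hypothetical name) that rebuilds the architecture you trained. The file name <code class="markup--code markup--p-code">checkpoint.pth</code> is a placeholder too. The heavy imports sit inside the functions so the file itself can be imported on a machine without PyTorch.</p>

```python
import io

# Per-channel mean and standard deviation commonly used with ImageNet-pretrained models.
IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]


def get_model(build_model, checkpoint_path="checkpoint.pth"):
    """Rebuild the network and load the trained weights for inference.

    ASSUMPTIONS: `build_model` is a function you write that returns the same
    architecture you trained, and the checkpoint file holds a plain state_dict.
    """
    import torch  # imported here so this file can be imported without PyTorch

    model = build_model()
    state_dict = torch.load(checkpoint_path, map_location="cpu")
    model.load_state_dict(state_dict)
    model.eval()  # switch off dropout and batch-norm updates for inference
    return model


def get_tensor(image_bytes):
    """Turn the raw bytes of an uploaded image into a (1, 3, 224, 224) tensor."""
    from PIL import Image
    from torchvision import transforms

    # ASSUMPTION: the same resize/crop/normalize steps were used during training.
    preprocess = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(IMAGENET_MEAN, IMAGENET_STD),
    ])
    image = Image.open(io.BytesIO(image_bytes)).convert("RGB")
    return preprocess(image).unsqueeze(0)  # add the batch dimension
```

<p class="graf graf--p">If your checkpoint stores more than the weights (for example a dictionary with the weights under one key), adjust the loading step to match how you saved it.</p>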
</div>
</div>
</section>
<section class="section section--body section--first">
<div class="section-content">
<div class="section-inner sectionLayout--insetColumn"><p class="graf graf--p">Next, create an <strong class="markup--strong markup--p-strong">inference.py</strong> file. You need to be able to sort out the flower names, classes, and labels, so you can write something like this:</p>
<figure id="34d6" class="graf graf--figure graf--iframe graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"></div>
</div>
</figure>
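<p class="graf graf--p">Here is one hedged sketch of the label bookkeeping for <strong class="markup--strong markup--p-strong">inference.py</strong>, written in plain Python so you can test it without a model. It assumes two mappings saved at training time: <code class="markup--code markup--p-code">class_to_idx</code> (category label to output index) and <code class="markup--code markup--p-code">cat_to_name</code> (category label to flower name). The function names and the three-class example data are illustrative, not taken from the original project.</p>

```python
import json


def load_json(path):
    """Read a JSON mapping (for example a label file) from disk."""
    with open(path) as f:
        return json.load(f)


def get_flower_name(scores, class_to_idx, cat_to_name):
    """Map the model's highest score to a human-readable flower name.

    scores:       one number per output index (e.g. outputs[0].tolist())
    class_to_idx: category label -> output index, saved at training time
    cat_to_name:  category label -> flower name
    """
    # Invert the training-time mapping: output index -> category label.
    idx_to_class = {idx: label for label, idx in class_to_idx.items()}
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    label = idx_to_class[best_index]
    return cat_to_name[label]


# Tiny illustrative example: three classes instead of real label files.
class_to_idx = {"1": 0, "10": 1, "2": 2}
cat_to_name = {"1": "pink primrose", "2": "hard-leaved pocket orchid", "10": "globe thistle"}
print(get_flower_name([0.1, 2.3, -0.5], class_to_idx, cat_to_name))  # globe thistle
```

<p class="graf graf--p">In the real file, the scores would come from running your model on the tensor built from the uploaded image, and the two mappings would be loaded once when the app starts.</p>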
<p id="cd4d" class="graf graf--p graf-after--figure">Update your <strong class="markup--strong markup--p-strong">app.py</strong> file so that it reads</p>
<figure id="3465" class="graf graf--figure graf--iframe graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"></div>
</div>
</figure>
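<p class="graf graf--p">As a rough sketch of how the updated <strong class="markup--strong markup--p-strong">app.py</strong> might be wired together, the code below checks the upload and hands the image bytes to a prediction function. The names <code class="markup--code markup--p-code">check_upload</code>, <code class="markup--code markup--p-code">create_app</code>, and <code class="markup--code markup--p-code">predict</code> are assumptions made for this example; <code class="markup--code markup--p-code">predict</code> stands for whatever function you wrote that takes image bytes and returns a flower name.</p>

```python
def check_upload(files):
    """Return an error message for a bad upload, or None when it looks fine."""
    if 'file' not in files:
        return 'No file was uploaded.'
    if files['file'].filename == '':
        return 'No file was selected.'
    return None


def create_app(predict):
    """Build the Flask app; `predict` takes image bytes and returns a flower name."""
    from flask import Flask, request, render_template

    app = Flask(__name__)

    @app.route('/', methods=['GET', 'POST'])
    def hello_world():
        if request.method == 'GET':
            return render_template('index.html', value='hi')
        # POST: make sure a file actually arrived before predicting.
        error = check_upload(request.files)
        if error is not None:
            return error, 400
        image_bytes = request.files['file'].read()
        flower_name = predict(image_bytes)
        return render_template('result.html', flower=flower_name)

    return app
```

<p class="graf graf--p">To run it, you would call <code class="markup--code markup--p-code">create_app</code> with your own prediction function at the bottom of the file and keep the <code class="markup--code markup--p-code">app.run(debug=True)</code> block from earlier.</p>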
<p id="1500" class="graf graf--p graf-after--figure">(If you’re paying attention, you’ll see that I added a couple of lines in the code above to make sure that you’ll get an error message if your file wasn’t uploaded.)</p>
<p id="71bc" class="graf graf--p graf-after--p">Make sure your <strong class="markup--strong markup--p-strong">result.html</strong> file reads something like this:</p>
<figure id="b1a6" class="graf graf--figure graf--iframe graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"></div>
</div>
</figure>
<p id="29eb" class="graf graf--p graf-after--figure">and you should be able to upload an image and get a result!</p>
<figure id="ae7c" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"></div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/9b53f-8cabe-1cjl0_fnpb3tekeyr6mvhvq.png?w=1080" /></div>
</div><figcaption class="imageCaption">Image upload</figcaption></figure>
<figure id="3003" class="graf graf--figure graf-after--figure">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"></div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/9e9ec-6976a-1jq_y5su8cnpvuuvm-bhscg.png?w=1080" /></div>
</div><figcaption class="imageCaption">Results!</figcaption></figure>
<p id="df38" class="graf graf--p graf-after--figure">That’s it!</p>
<p id="5bf9" class="graf graf--p graf-after--p">You now have a working web application, built on your image classifier, that lets anyone upload an image of a flower and predicts its species!</p>
<p id="8808" class="graf graf--p graf-after--figure">Now it’s up to you to refine your classifier and model. You can figure out how to make your classifier faster and more accurate. (Looking for ways to finetune your model? Check out the <a class="markup--anchor markup--p-anchor" href="https://pytorch.org/tutorials/beginner/finetuning_torchvision_models_tutorial.html" target="_blank" rel="noopener noreferrer">official tutorial first</a>! After that, check out <a class="markup--anchor markup--p-anchor" href="https://mc.ai/ideas-on-how-to-fine-tune-a-pre-trained-model-in-pytorch/" target="_blank" rel="noopener noreferrer">this article by Florin Cioloboc and Harisyam Manda</a>. It’s full of great suggestions.) You may want to add code that can let people know if they’ve uploaded the wrong kind of file. You may decide you want people to see the top five species results or the probability that their flower is, in fact, the species that your classifier predicted. What you do from here is up to you!</p>
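<p class="graf graf--p">Two of those ideas can be sketched in a few lines of plain Python. The first helper checks the file extension before you bother running the model, and the other two turn raw model scores into probabilities and pick the top results. The function names, the allowed extensions, and the example scores are all made up for illustration.</p>

```python
import math

ALLOWED_EXTENSIONS = {"jpg", "jpeg", "png"}


def allowed_file(filename):
    """Return True when the filename ends in one of the allowed image extensions."""
    return "." in filename and filename.rsplit(".", 1)[1].lower() in ALLOWED_EXTENSIONS


def softmax(scores):
    """Convert raw model scores into probabilities that sum to 1."""
    highest = max(scores)  # subtracting the max keeps exp() from overflowing
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_predictions(scores, names, k=5):
    """Return the k most likely (name, probability) pairs, best first."""
    probabilities = softmax(scores)
    ranked = sorted(zip(names, probabilities), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Made-up scores for four flowers, just to show the shape of the result.
names = ["lily", "rose", "daisy", "tulip"]
for name, probability in top_predictions([2.0, 0.5, 1.0, -1.0], names, k=3):
    print(f"{name}: {probability:.1%}")
```

<p class="graf graf--p">Checking the extension is only a first filter. A renamed text file would still pass it, so you may also want to catch the error your image library raises when a file can’t be opened.</p>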
<p id="27a4" class="graf graf--p graf-after--p">…you might also want to make this thing look a little sexier.</p>
<p id="ca78" class="graf graf--p graf-after--p">By taking three minutes to insert a little CSS in my <strong class="markup--strong markup--p-strong">index.html</strong> file plus an image in a separate folder,</p>
</div>
<div class="section-inner sectionLayout--insetColumn">
<p id="6807" class="graf graf--p graf-after--figure">I went from this</p>
<figure id="93af" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"></div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/85da7-96e89-1tpcz7olcvexohvvaqvjgtq.png?w=1080" /></div>
</div>
</figure>
<p id="ab94" class="graf graf--p graf-after--figure">to this!</p>
<figure id="64f2" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"></div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/63e59-727c1-1yl8sflizugdqqmvcgbtaoa.png?w=1080" /></div>
</div>
</figure>
<p id="f480" class="graf graf--p graf-after--figure">This is just the most basic example of how to deploy a PyTorch image classifier with Flask. You can do anything from here!</p>
<p id="a1d4" class="graf graf--p graf-after--p">Our next step will be to turn this baby into an app, so stay tuned. Also, if you want to take a look at this code and the folder structure, you’re welcome to check out <a class="markup--anchor markup--p-anchor" href="https://github.com/bonn0062/flask_model_deployment" target="_blank" rel="noopener noreferrer">this basic model deployment GitHub repo</a>.</p>
<p id="8126" class="graf graf--p graf-after--p graf--trailing">As always, if you create anything awesome, please share it in the responses below or reach out any time on Twitter <a class="markup--anchor markup--p-anchor" href="https://twitter.com/annebonnerdata" target="_blank" rel="noopener noreferrer">@annebonnerdata</a>!</p>
</div>
</div>
</section>
<section class="section section--body section--last">
<div class="section-divider">
<hr class="section-divider" />
</div>
<div class="section-content">
<div class="section-inner sectionLayout--insetColumn"></div>
</div>
</section>
<span class="et_bloom_bottom_trigger"></span><p>The post <a href="https://contentsimplicity.com/the-brilliant-beginners-guide-to-model-deployment/">The brilliant beginner’s guide to model deployment</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://contentsimplicity.com/the-brilliant-beginners-guide-to-model-deployment/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">214</post-id>	</item>
		<item>
		<title>Get Involved With SciPy</title>
		<link>https://contentsimplicity.com/get-involved-with-scipy/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=get-involved-with-scipy&#038;utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=get-involved-with-scipy</link>
					<comments>https://contentsimplicity.com/get-involved-with-scipy/#respond</comments>
		
		<dc:creator><![CDATA[Anne B]]></dc:creator>
		<pubDate>Fri, 27 Sep 2019 12:37:00 +0000</pubDate>
				<category><![CDATA[NumPY]]></category>
		<category><![CDATA[SciPy]]></category>
		<category><![CDATA[Season of Docs]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[survey]]></category>
		<category><![CDATA[technical documentation]]></category>
		<category><![CDATA[technical writing]]></category>
		<guid isPermaLink="false">https://contentsimplicity.com/?p=1027</guid>

					<description><![CDATA[<p>You have heard of SciPy. Have you given any thought to getting involved with SciPy to let them know how they can improve their documentation?</p>
<p>The post <a href="https://contentsimplicity.com/get-involved-with-scipy/">Get Involved With SciPy</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<h4 class="wp-block-heading">SciPy wants your thoughts on its technical documentation and user guides</h4>






<p class="wp-block-paragraph">You’ve heard of SciPy.</p>



<p class="wp-block-paragraph">You’ve probably used it.</p>



<p class="wp-block-paragraph">You might have even looked through some of the technical documentation and user guides. You might even have an opinion of the documentation…</p>



<p class="wp-block-paragraph">But have you given any thought to getting involved with SciPy and letting them know how they can improve their documentation? Telling SciPy what you like and what you don’t like or how you think the documentation can be improved?</p>



<p class="wp-block-paragraph">Now’s your chance!</p>



<h3 class="wp-block-heading">What is SciPy?</h3>






<p class="wp-block-paragraph">It’s scientific (<strong>Sci</strong>) Python (<strong>Py</strong>)! <a href="https://www.scipy.org/" rel="noreferrer noopener" target="_blank">SciPy</a> is a free and open-source Python library. It’s used for scientific computing and technical computing. It contains modules for <a href="https://en.wikipedia.org/wiki/Optimization_%28mathematics%29" rel="noreferrer noopener" target="_blank">optimization</a>, <a href="https://en.wikipedia.org/wiki/Linear_algebra" rel="noreferrer noopener" target="_blank">linear algebra</a>, <a href="https://en.wikipedia.org/wiki/Integral" rel="noreferrer noopener" target="_blank">integration</a>, <a href="https://en.wikipedia.org/wiki/Interpolation" rel="noreferrer noopener" target="_blank">interpolation</a>, <a href="https://en.wikipedia.org/wiki/Special_functions" rel="noreferrer noopener" target="_blank">special functions</a>, <a href="https://en.wikipedia.org/wiki/Fast_Fourier_transform" rel="noreferrer noopener" target="_blank">FFT</a>, <a href="https://en.wikipedia.org/wiki/Signal_processing" rel="noreferrer noopener" target="_blank">signal</a> and <a href="https://en.wikipedia.org/wiki/Image_processing" rel="noreferrer noopener" target="_blank">image processing</a>, <a href="https://en.wikipedia.org/wiki/Ordinary_differential_equation" rel="noreferrer noopener" target="_blank">ODE</a> solvers, and other tasks common in science and engineering.</p>



<p class="wp-block-paragraph">SciPy builds on the <a rel="noreferrer noopener" href="https://en.wikipedia.org/wiki/NumPy" target="_blank">NumPy</a> array object, which it uses as its basic data structure. It has modules for commonly used tasks in scientific programming, including integration (calculus), ordinary differential equation solving, and signal processing. SciPy is part of the NumPy stack, which includes tools like <a rel="noreferrer noopener" href="https://en.wikipedia.org/wiki/Matplotlib" target="_blank">Matplotlib</a>, <a rel="noreferrer noopener" href="https://en.wikipedia.org/wiki/Pandas_%28software%29" target="_blank">Pandas</a>, and <a rel="noreferrer noopener" href="https://en.wikipedia.org/wiki/SymPy" target="_blank">SymPy</a>, along with an expanding set of scientific computing libraries.</p>
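


<p class="wp-block-paragraph">To make that concrete, here is a minimal sketch (not taken from the SciPy docs) that assumes NumPy and SciPy are installed. It uses <code>scipy.integrate.quad</code> to integrate a NumPy function and <code>scipy.optimize.minimize</code> to minimize a small function. The function <code>squared_distance</code> is made up for illustration.</p>

```python
import numpy as np
from scipy import integrate, optimize

# Integrate sin(x) from 0 to pi; mathematically the exact answer is 2.
area, abs_error = integrate.quad(np.sin, 0.0, np.pi)


# Illustrative function (an assumption, not from the post): (x - 3)^2.
def squared_distance(x):
    return (x[0] - 3.0) ** 2


# Start the search at x = 0; the starting point is passed as a NumPy array.
result = optimize.minimize(squared_distance, x0=np.array([0.0]))

print(f"area under sin(x) on [0, pi]: {area:.6f}")
print(f"minimizer of (x - 3)^2: {result.x[0]:.4f}")
```

<p class="wp-block-paragraph">The optimizer hands its answer back as a NumPy array (<code>result.x</code>), which is the shared data structure at work.</p>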



<p class="wp-block-paragraph"></p>



<h3 class="wp-block-heading">How can you get involved?</h3>



<p class="wp-block-paragraph"></p>



<p class="wp-block-paragraph"><a href="https://forms.gle/eK3x2ohJs1sLPJEk8" target="_blank" rel="noreferrer noopener" aria-label="Take a quick survey! (opens in a new tab)">Take a quick survey!</a></p>



<p class="wp-block-paragraph"></p>



<p class="wp-block-paragraph">While I’m over at NumPy working on creating a section in the technical documentation aimed at beginners, Maja Gwozdz is hard at work in the SciPy docs. She’s combing through the SciPy documentation to create something that’s even more helpful for you. She’s reaching out to the whole community (that’s you!) to find out what you like and don’t like, and she would love your input!</p>



<p class="wp-block-paragraph">As Maja wrote in her proposal for Google Season of Docs:</p>



<p class="wp-block-paragraph"></p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"><p>“I intend to work on the refactoring of the existing documentation so that it would be easily accessible by users with different needs. It goes without saying that a researcher is most likely interested in advanced and subtle features, whereas a user without prior expertise appreciates step-by-step guides and diagrams.</p></blockquote>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"><p>I am interested in pursuing this project for personal and professional reasons: first of all, I would like to contribute significantly to SciPy because my own research has greatly benefited from it and secondly, I encounter insufficient (or lacking) documentation all too often in other software and always wonder how much faster (if at all!) users could learn how to use the code had they been provided with a thorough guide.”</p></blockquote>



<p class="wp-block-paragraph"></p>



<p class="wp-block-paragraph">Maja put together a survey for everyone in the community who wants to be heard. This is an amazing opportunity to raise your hand and get involved. <a rel="noreferrer noopener" href="https://forms.gle/eK3x2ohJs1sLPJEk8" target="_blank">You can find the survey here</a> and it’s designed to let you give as much of your time and input as you feel like giving.</p>



<p class="wp-block-paragraph">The questions are very straightforward and most of them have simple multiple-choice answers. You’ll answer questions like, “What parts of the documentation do you use?” and “Which of the documentation features should be improved/added?” Below the multiple-choice questions, you can add your own comments and suggestions.</p>



<p class="wp-block-paragraph">It’s quick, it’s easy, and it’s incredibly helpful. If you’ve used SciPy and the SciPy documentation, Maja would love to hear from you. It is time to get involved with SciPy.</p>



<p class="wp-block-paragraph"></p>



<p class="wp-block-paragraph"><a rel="noreferrer noopener" href="https://forms.gle/eK3x2ohJs1sLPJEk8" target="_blank">Take a minute or two to speak up and be heard!</a></p>



<p class="wp-block-paragraph"></p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2ASisuc4EUvpQBtYRjCbdgAg.jpeg?w=1080&#038;ssl=1" alt=""/><figcaption>Photo by <a href="https://www.pexels.com/@whitegold4tography?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels" rel="noreferrer noopener" target="_blank">White Gold Photography </a>from&nbsp;<a href="https://www.pexels.com/photo/photo-of-woman-singing-2345342/?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels" rel="noreferrer noopener" target="_blank">Pexels</a></figcaption></figure>



<hr class="wp-block-separator"/>



<p class="wp-block-paragraph"><em>Featured photo by <a rel="noreferrer noopener" href="https://www.pexels.com/@lum3n-com-44775?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels" target="_blank">Lum3n.com </a>from <a rel="noreferrer noopener" href="https://www.pexels.com/photo/adorable-blur-breed-close-up-406014/?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels" target="_blank">Pexels</a></em></p>
<span class="et_bloom_bottom_trigger"></span><p>The post <a href="https://contentsimplicity.com/get-involved-with-scipy/">Get Involved With SciPy</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://contentsimplicity.com/get-involved-with-scipy/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">1027</post-id>	</item>
		<item>
		<title>NumPy and SciPy and Google Season of Docs, Oh My: Meet Maja Gwózdz</title>
		<link>https://contentsimplicity.com/scipy-meet-maja-gwozdz/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=scipy-meet-maja-gwozdz&#038;utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=scipy-meet-maja-gwozdz</link>
					<comments>https://contentsimplicity.com/scipy-meet-maja-gwozdz/#respond</comments>
		
		<dc:creator><![CDATA[Anne B]]></dc:creator>
		<pubDate>Wed, 25 Sep 2019 12:21:00 +0000</pubDate>
				<category><![CDATA[NumPY]]></category>
		<category><![CDATA[SciPy]]></category>
		<category><![CDATA[Season of Docs]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[technical documentation]]></category>
		<category><![CDATA[technical writing]]></category>
		<category><![CDATA[writer]]></category>
		<category><![CDATA[writing]]></category>
		<guid isPermaLink="false">https://contentsimplicity.com/?p=1020</guid>

					<description><![CDATA[<p>Go behind the scenes to meet the people and learn about some of the work we’re doing right now with the technical documentation at NumPy and SciPy.</p>
<p>The post <a href="https://contentsimplicity.com/scipy-meet-maja-gwozdz/">NumPy and SciPy and Google Season of Docs, Oh My: Meet Maja Gwózdz</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<h2 class="wp-block-heading">Learn more about the technical writers paired with NumPy and SciPy during Google Season of Docs</h2>



<p class="wp-block-paragraph"></p>



<p class="wp-block-paragraph"><em>Welcome! From September through November, our little corner of the open-source world is going to involve technical documentation updates at NumPy and SciPy!</em></p>



<p class="wp-block-paragraph"></p>



<h3 class="wp-block-heading">A behind-the-scenes tour</h3>



<p class="wp-block-paragraph"></p>



<p class="wp-block-paragraph">You get to go behind the scenes to meet the people and learn about some of the work we’re doing right now with the technical documentation at <a rel="noreferrer noopener" href="https://numpy.org" target="_blank">NumPy</a> and <a rel="noreferrer noopener" href="https://scipy.org" target="_blank">SciPy</a>.</p>



<p class="wp-block-paragraph">A few weeks ago, I told you I would let you know more about the behind-the-scenes action and the technical writers who are going to be working with NumPy and SciPy during Google Season of Docs. </p>



<p class="wp-block-paragraph"></p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/2400/1%2AxB6HjNiuql2tqYoHI5eEtQ.jpeg?w=1080&#038;ssl=1" alt=""/><figcaption>Photo by <a href="https://www.pexels.com/@pixabay?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels" rel="noreferrer noopener" target="_blank">Pixabay </a>from&nbsp;<a href="https://www.pexels.com/photo/action-adult-dance-dancer-270837/?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels" rel="noreferrer noopener" target="_blank">Pexels</a></figcaption></figure>



<p class="wp-block-paragraph"></p>



<p class="wp-block-paragraph"><strong>It’s time to meet Maja!</strong></p>



<p class="wp-block-paragraph"></p>



<p class="wp-block-paragraph">Maja has done some knockout research, which <a rel="noreferrer noopener" href="https://lmu-munich.academia.edu/MajaGw%C3%B3%C5%BAd%C5%BA" target="_blank">you can find here</a>. She has not only had significant experience with SciPy, but she’s well aware of what a difference great documentation and guides can make. Because it’s so easy for technical writers to get lost in the background of a project, I wanted to take this space to let you know what she’s working on in her own words.</p>



<p class="wp-block-paragraph">If you aren’t familiar with what we’re doing with NumPy and SciPy through Google Season of Docs, you can read all about it here:</p>



<p class="wp-block-paragraph"><a href="https://towardsdatascience.com/what-do-you-want-to-see-in-the-numpy-docs-de73efb80375"><strong>What do You Want to See in the NumPy Docs?</strong><br><em>Behind the scenes at NumPy and SciPy with Google Season of Docs</em></a></p>



<p class="wp-block-paragraph">While I’m building a new beginner-oriented technical documentation section with NumPy, Maja is working with SciPy to restructure its existing documentation.</p>



<p class="wp-block-paragraph"></p>



<h3 class="wp-block-heading">Meet Maja Gwózdz!</h3>



<p class="wp-block-paragraph"><em>I made a couple of very minor tweaks, but here’s what Maja had to say about herself and her plans for SciPy and Season of Docs:</em></p>



<p class="wp-block-paragraph"></p>



<h3 class="wp-block-heading">About <a rel="noreferrer noopener" href="https://lmu-munich.academia.edu/MajaGw%C3%B3%C5%BAd%C5%BA" target="_blank">Maja</a></h3>



<p class="wp-block-paragraph">I completed a BA in English Studies with distinction (Jagiellonian University, Poland) and then obtained an MPhil in Theoretical and Applied Linguistics with distinction from the University of Cambridge. I then decided to pursue a BSc in Computer Science (at the Ludwig Maximilian University in Munich) and take additional courses in mathematics (so far, I have completed the following extension courses either from UC Berkeley or the University of Illinois: Elementary Number Theory, Calculus II, Precalculus, Python Programming). Other relevant technical courses I have taken so far are: Real Analysis, Linear Algebra, Introduction to Programming, Algorithms and Datastructures, Discrete Mathematics and Logic, Introduction to Functional Programming, Introduction to Artificial Intelligence, Logic, Computer Architecture. As regards machine learning, I have a working knowledge of statistics, the Multilayer Perceptron Classifier (especially its application to automatic speech recognition), and other popular Artificial Neural Networks.</p>



<p class="wp-block-paragraph">While I am not a technical writer in the strictly professional sense of the word, I am familiar with Sphinx and I have performed the tasks of a technical writer on several occasions. For instance, I completed an internship at Lufthansa CityLine, where I was responsible for running penetration tests and writing a technical report on network vulnerabilities. I was also responsible for designing a JIRA / Confluence workflow and preparing a basic guide for internal users. I was a student at GSoC 2018 (it was a project on corpus linguistics involving, among other tasks, the creation of annotation guidelines) and I am currently a mentor at the same organisation (CLiPS, the University of Antwerp).</p>



<p class="wp-block-paragraph">I am passionate about clear and logical communication of technical matters and I believe that this project suits my background perfectly because I have the required linguistic tools to convey complex ideas plus the necessary mathematical / computer science knowledge to comprehend the subject (or, at least, know how to ask the right questions about the given matter).</p>



<p class="wp-block-paragraph">I pay great attention to detail but, at the same time, try not to lose sight of the big picture. Whenever I notice that I spend too much time on a less urgent task, I quickly move on to the important phases, so as to meet the deadline (time permitting, I take care of the less urgent issues, of course). Getting stuck is the natural part of any creative process and it is, indeed, valuable but if it becomes a true obstacle I never hesitate to ask for help. This approach has worked very well in my previous projects and I intend to apply it to subsequent endeavours. In the interaction with supervisors and team colleagues, I particularly like constructive criticism and frequent feedback. While support and positive comments are undoubtedly important, I have never made significant progress based on praise alone. I enjoy challenging tasks and, as regards my approach to solving real-life software problems, I believe that actively listening to community members and global users is THE way to create excellent software. It would be an honour to work on SciPy.</p>



<p class="wp-block-paragraph"></p>



<h3 class="wp-block-heading">Motivation</h3>



<p class="wp-block-paragraph">I intend to work on the refactoring of the existing documentation, so that it would be easily accessible by users with different needs. It goes without saying that a researcher is most likely interested in advanced and subtle features, whereas a user without prior expertise appreciates step-by-step guides and diagrams.</p>



<p class="wp-block-paragraph">I am interested in pursuing this project for personal and professional reasons: first of all, I would like to contribute significantly to SciPy because my own research has greatly benefited from it and secondly, I encounter insufficient (or lacking) documentation all too often in other software and always wonder how much faster (if at all!) users could learn how to use the code had they been provided with a thorough guide.</p>



<p class="wp-block-paragraph"></p>



<h3 class="wp-block-heading">Goals</h3>



<p class="wp-block-paragraph">I aim to improve the existing SciPy documentation both content- and graphic-wise. The most important feature of my approach to this problem is the deployment and analysis of the user survey, that is to say, a concise survey conducted online enabling various users to voice their needs regarding the documentation. I strongly believe that their opinions should be the source of inspiration (how else can we create more user-friendly documentation?).</p>



<p class="wp-block-paragraph">As regards the realisation of the project itself, the first phase will involve designing and analysing the user survey, as well as tackling several stylistic issues I have noticed in the current documentation. For instance, lack of consistency (example: 2-dimensional arrays occurring alongside two-dimensional arrays), convoluted sentences that ought to be rewritten, or the lack of alphabetical order in certain subpages. The second phase will focus on the introduction of graphical guides containing hyperlinks to the relevant topics (based on the survey results and other community requests). In the long run, I wish to achieve a satisfactory documentation tailored to different kinds of users. Moreover, I will attempt to render the tutorials more consistent both linguistically and structurally. Last but not least, I aim to write new tutorials (based on the current community needs).</p>
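


<p class="wp-block-paragraph">A check for the kind of inconsistency mentioned above could be sketched in a few lines of Python. This is only an illustration, not part of the proposal: the helper <code>count_variants</code>, the pattern table, and the sample text are all hypothetical.</p>

```python
import re
from collections import Counter

# Hypothetical spelling variants that should not both appear in the same docs.
VARIANT_PATTERNS = {
    "2-dimensional": re.compile(r"\b2-dimensional\b", re.IGNORECASE),
    "two-dimensional": re.compile(r"\btwo-dimensional\b", re.IGNORECASE),
}


def count_variants(text):
    """Count how often each spelling variant occurs in the text."""
    counts = Counter()
    for label, pattern in VARIANT_PATTERNS.items():
        counts[label] = len(pattern.findall(text))
    return counts


# Invented sample text standing in for a documentation page.
sample = (
    "Pass a 2-dimensional array to the solver. "
    "The result is a two-dimensional array of the same shape. "
    "Each 2-dimensional input is validated first."
)

counts = count_variants(sample)
# The page is inconsistent when every variant occurs at least once.
is_mixed = all(counts.values())

for label, n in counts.items():
    print(f"{label}: {n}")
print("inconsistent spelling" if is_mixed else "consistent spelling")
```

<p class="wp-block-paragraph">Running something like this over each documentation page would flag where the two spellings are mixed, leaving the choice of which one to standardise on to the style guide.</p>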



<p class="wp-block-paragraph"></p>



<h3 class="wp-block-heading">User survey</h3>



<p class="wp-block-paragraph">As regards the user survey, I propose to use Google Forms for several reasons. First of all, Google Forms is free and offers unlimited functionality (in terms of the number of respondents, questions, etc.), it has an appealing visual form, the most useful survey options (for instance, the customisable linear scale, checkboxes, and multiple choice), and, most importantly, the results can be easily exported for the purposes of statistical analysis. Based on online research, it appears that Google Forms is, at least for now, the best free tool for conducting surveys. On a less serious note, it would be a nice gesture to use a Google product in a Google-run initiative.</p>



<p class="wp-block-paragraph">I have created a preliminary survey with sample questions (it can be accessed here). A reasonable number of questions in the final version ought to be between ten and fifteen. In order to obtain concrete results, I suggest that we predominantly use multiple-choice questions, a linear scale, and a few checkboxes. The linear scale should not resemble a full spectrum, though (it only causes confusion and the results are likely to suffer from high dispersion). There ought to be at most two open-ended questions; otherwise, the results will be highly dispersed and not helpful at all. I reckon that even a very high number of responses would not be problematic due to the fact that the data can be easily exported and analysed automatically with statistical software. Assuming that the number of responses is, indeed, very high, the analysis of open-ended questions could be a little time-consuming but I presume that it will not be overwhelming. After all, an average user is not likely to write an essay about the state of the documentation. In the worst-case scenario, some answers can be simply stored for future analysis.</p>
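


<p class="wp-block-paragraph">As a rough illustration of that kind of automated analysis (not part of the proposal), the sketch below tallies a multiple-choice question and averages a linear-scale question using only the Python standard library. The CSV text, column names, and answers are invented; a real Google Forms export would have its own headers.</p>

```python
import csv
import io
from collections import Counter

# Hypothetical column names; a real export would use the survey's own questions.
CHOICE_COLUMN = "Most used section"
SCALE_COLUMN = "Ease of finding information (1-5)"

# Invented sample data standing in for an exported responses file.
EXPORTED_CSV = """Timestamp,Most used section,Ease of finding information (1-5)
2019/09/25 10:00:00,Tutorials,4
2019/09/25 10:05:00,API reference,3
2019/09/25 10:09:00,Tutorials,2
2019/09/25 10:15:00,Tutorials,5
"""


def summarize(csv_text, choice_column, scale_column):
    """Tally the multiple-choice answers and average the linear-scale answers."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    choice_counts = Counter(row[choice_column] for row in rows)
    scale_values = [int(row[scale_column]) for row in rows]
    mean_scale = sum(scale_values) / len(scale_values)
    return choice_counts, mean_scale


choices, mean_rating = summarize(EXPORTED_CSV, CHOICE_COLUMN, SCALE_COLUMN)

for answer, count in choices.most_common():
    print(f"{answer}: {count}")
print(f"mean rating: {mean_rating:.2f}")
```

<p class="wp-block-paragraph">Open-ended answers would still need a human reader, but closed questions like these can be summarised the moment the responses are exported.</p>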



<p class="wp-block-paragraph"></p>



<h3 class="wp-block-heading">Graphical guides</h3>



<p class="wp-block-paragraph">My vision of the graphical guides (intended to serve as navigational tools) is based on a popular premise that (most) humans are better at processing straightforward visual structures rather than purely text-based information. Moreover, a thematically-oriented diagram with lines connecting similar topics of interest is, undoubtedly, a highly valuable asset for less experienced users (and not only).</p>



<p class="wp-block-paragraph">As regards the implementation details, I propose to use the TikZ package. First and foremost, it is a powerful tool and does not seem to be at risk of being deprecated soon. It also offers high-quality output, has really solid documentation, and is a frequent topic on TeX StackExchange and other mainstream forums. Most importantly, the integration of a TikZ file (more precisely, the numerous hyperlinks therein) with HTML documentation does not appear to pose significant problems due to the existence of various packages and fixes for embedding a TikZ picture in HTML (for instance, TeX4ht).</p>



<p class="wp-block-paragraph">The question of future maintenance of the guides within SciPy can be easily solved by using, say, Overleaf (facilitates collaboration plus offers an instant preview) and predefined templates that I will supply. Basically, the graphical guides are not likely to differ hugely from one another. The structure, colour palette, and shapes are, more or less, going to be invariant, therefore subsequent re-shaping and further customisation will not be an issue. A rough sketch of such a guide (observe the counter-clockwise alphabetical order in the subcategories) is provided on the next page. The complete diagram will, of course, contain hyperlinks to the respective sections in the documentation.</p>



<p class="wp-block-paragraph"></p>



<p class="wp-block-paragraph"><em>Featured Photo by <a rel="noreferrer noopener" href="https://www.pexels.com/@philipp-deus-1172079?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels" target="_blank">Philipp Deus </a>from <a rel="noreferrer noopener" href="https://www.pexels.com/photo/selective-focus-photography-of-jelly-fish-2234000/?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels" target="_blank">Pexels</a></em></p>
<span class="et_bloom_bottom_trigger"></span><p>The post <a href="https://contentsimplicity.com/scipy-meet-maja-gwozdz/">NumPy and SciPy and Google Season of Docs, Oh My: Meet Maja Gwózdz</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://contentsimplicity.com/scipy-meet-maja-gwozdz/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">1020</post-id>	</item>
		<item>
		<title>NumPy and SciPy and Google Season of Docs, Oh My: Meet Christina Lee</title>
		<link>https://contentsimplicity.com/numpy-scipy-writer-christina-lee/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=numpy-scipy-writer-christina-lee&#038;utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=numpy-scipy-writer-christina-lee</link>
		
		<dc:creator><![CDATA[Anne B]]></dc:creator>
		<pubDate>Sat, 21 Sep 2019 12:15:00 +0000</pubDate>
				<category><![CDATA[NumPY]]></category>
		<category><![CDATA[SciPy]]></category>
		<category><![CDATA[Season of Docs]]></category>
		<category><![CDATA[writing]]></category>
		<guid isPermaLink="false">https://contentsimplicity.com/?p=983</guid>

					<description><![CDATA[<p>Welcome to Season of Docs! You’re going behind the scenes to meet the people and learn about the work being done right now at NumPy and SciPy.</p>
<p>The post <a href="https://contentsimplicity.com/numpy-scipy-writer-christina-lee/">NumPy and SciPy and Google Season of Docs, Oh My: Meet Christina Lee</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p class="wp-block-paragraph"></p>



<h3 class="wp-block-heading">Learn more about the technical writers paired with NumPy and SciPy during Google Season of&nbsp;Docs</h3>



<p class="wp-block-paragraph"></p>



<hr class="wp-block-separator"/>



<p class="wp-block-paragraph"></p>



<p class="wp-block-paragraph"><em>From September through November, our little corner of the open-source world is going to involve technical documentation updates at NumPy and SciPy!</em></p>



<p class="wp-block-paragraph"></p>



<h3 class="wp-block-heading">Welcome to NumPy and SciPy!!!</h3>



<hr class="wp-block-separator"/>



<p class="wp-block-paragraph">You’re going behind the scenes to meet the people and learn about some of the work we’re doing right now at NumPy and SciPy.</p>



<p class="wp-block-paragraph">A couple of weeks ago, I told you I would let you know more about the technical writers who are going to be working with NumPy and SciPy during Google Season of Docs. It’s time to meet Christina Lee!</p>



<p class="wp-block-paragraph">If you aren’t familiar with the project, you can read all about it here:</p>



<p class="wp-block-paragraph"></p>



<p class="wp-block-paragraph"><a rel="noreferrer noopener" aria-label="What do You Want to See in the NumPy Docs?
Behind the scenes at NumPy and SciPy with Google Season of Docs (opens in a new tab)" href="https://contentsimplicity.com/numpy-scipy-google-season-of-docs/" target="_blank"><strong>What do You Want to See in the NumPy Docs?</strong><br><em>Behind the scenes at NumPy and SciPy with Google Season of Docs</em></a></p>



<p class="wp-block-paragraph"></p>



<h3 class="wp-block-heading">What is Google Season of Docs?</h3>



<p class="wp-block-paragraph"></p>



<p class="wp-block-paragraph">Google did an amazing thing by creating <a href="https://developers.google.com/season-of-docs/" rel="noreferrer noopener" target="_blank">Season of Docs</a>. It built real opportunities for technical writers to collaborate with open source organizations.</p>



<p class="wp-block-paragraph">Season of Docs is a three-month mentoring program that pairs technical writers with open source organizations. Writers have the opportunity to work with well-known and highly-regarded organizations. Open source organizations (who often don’t have a budget for technical writers) have the opportunity to work with experienced technical writers to improve and expand their existing documentation.</p>



<p class="wp-block-paragraph">It’s pretty incredible.</p>



<p class="wp-block-paragraph">I’m working with NumPy! Just to make things even cooler, there’s so much overlap between the NumPy and SciPy projects that we get to meet frequently and collaborate with each other. That means that I get to update all of you with the changes we’re making!</p>



<p class="wp-block-paragraph"></p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2AUZp8AChZAogOvlpoU0WBJQ.jpeg?w=1080&#038;ssl=1" alt=""/><figcaption>Photo by <a href="https://www.pexels.com/@pixabay?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels" rel="noreferrer noopener" target="_blank">Pixabay </a>from&nbsp;<a href="https://www.pexels.com/photo/animal-black-and-white-cute-funny-164703/?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels" rel="noreferrer noopener" target="_blank">Pexels</a></figcaption></figure>



<p class="wp-block-paragraph"></p>



<p class="wp-block-paragraph">Since I hadn’t yet learned a lot about Christina when I wrote the last post, it seemed like a good idea to use today’s post to introduce her to you.</p>



<p class="wp-block-paragraph"><em>I made a couple of very minor tweaks, but here’s what Christina had to say about herself and her plans:</em></p>



<p class="wp-block-paragraph"></p>



<h4 class="wp-block-heading">Meet Christina Lee!</h4>



<p class="wp-block-paragraph">Overall, I want to improve SciPy.org and <a href="http://docs.scipy.org/" rel="noreferrer noopener" target="_blank">docs.scipy.org</a>’s design and structure.</p>



<p class="wp-block-paragraph">I’m returning to Python after being a Julia programmer, so I might be helpful for newbie proofing Python code. I write Julia Jupyter notebooks on a variety of physics and numerics topics, available at <a href="http://albi3ro.github.io/M4" rel="noreferrer noopener" target="_blank">albi3ro.github.io/M4</a>&nbsp;. At JuliaCon, I gave a lightning talk on “Teaching with Code”, written up at <a href="http://albi3ro.github.io/M4/Teaching_With_Code.html" rel="noreferrer noopener" target="_blank">http://albi3ro.github.io/M4/Teaching_With_Code.html</a>, which summarizes my code teaching ideals.</p>



<p class="wp-block-paragraph"></p>



<h4 class="wp-block-heading">From her proposal:</h4>



<p class="wp-block-paragraph">Work on both the SciPy website and docs.scipy needs to start with a structural and graphical overhaul. At each page, I cannot instinctively tell how to navigate to what I want, what the purpose of the page is, or what the page wants me to feel and do. While Sphinx may be the tool of choice for documentation, we can pull away from Sphinx for both the main website (<a href="http://scipy.org/" rel="noreferrer noopener" target="_blank">scipy.org</a>) and the tutorials in favor of a more versatile web layout. Designing two distinct layouts for <a href="http://scipy.org/" rel="noreferrer noopener" target="_blank">scipy.org</a> and <a href="http://docs.scipy.org/" rel="noreferrer noopener" target="_blank">docs.scipy.org</a> will help clear up the confusion between the ecosystem and the package.</p>



<p class="wp-block-paragraph">While reworking the container for the content would form a good portion of the GSoD project, I would also work on the content on the website. The content breaks down into tutorial pages and surrounding pages. For the tutorials, I would highlight the basic usage front and center to get users up and going rapidly. Then I would want to focus on explaining what the numerical method accomplishes and what is possible beyond basic usage. Tutorials already exist, but editing could make them better. Reworking the content on the main pages would help with the navigational and structure problems discussed above.</p>



<hr class="wp-block-separator"/>



<p class="wp-block-paragraph">If it sounds exciting to work with organizations like NumPy and SciPy, just do it! Don’t wait! People get really overwhelmed at the idea of working on the code for an open-source organization. But there’s more going on than just the code. You can’t imagine how helpful it can be to have someone step in on the documentation side.</p>



<p class="wp-block-paragraph">If you want to get involved with open-source projects, get involved. If you love to write (or you love to work on the writing other people have done), get in there and work your magic! It’s up to everyone to make the tech world an even more amazing place than it already is.</p>



<p class="wp-block-paragraph">If you’re into data science, machine learning, artificial intelligence, or technology in general, then you’ve seen some documentation. If you’re having trouble understanding some of it, don’t sit back and wish things were different. Get in there and help.</p>



<p class="wp-block-paragraph">Make a difference!</p>



<p class="wp-block-paragraph">You might get to learn something new. You might even get to meet some incredibly cool people!</p>



<p class="wp-block-paragraph"></p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2AxlrvXzFV1vNiQ3fk6xR_2g.jpeg?w=1080&#038;ssl=1" alt=""/><figcaption>Photo by <a href="https://www.pexels.com/@timmossholder?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels" rel="noreferrer noopener" target="_blank">Tim Mossholder </a>from&nbsp;<a href="https://www.pexels.com/photo/white-and-blue-come-on-in-we-ere-open-signage-2432221/?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels" rel="noreferrer noopener" target="_blank">Pexels</a></figcaption></figure>



<p class="wp-block-paragraph"></p>



<p class="wp-block-paragraph">If you want to contribute to open-source organizations but don&#8217;t know how to use GitHub, check out this article:</p>



<p class="wp-block-paragraph"></p>



<p class="wp-block-paragraph"><a href="https://contentsimplicity.com/getting-started-with-git-and-github-the-complete-beginners-guide/"><strong>Getting started with Git and GitHub: the complete beginner’s guide</strong><br><em>Git and GitHub basics for the curious and completely confused</em></a></p>



<p class="wp-block-paragraph"></p>



<p class="wp-block-paragraph">Thanks for reading! As always, if you do anything cool with this information, let everyone know about it in the comments below!</p>
<span class="et_bloom_bottom_trigger"></span><p>The post <a href="https://contentsimplicity.com/numpy-scipy-writer-christina-lee/">NumPy and SciPy and Google Season of Docs, Oh My: Meet Christina Lee</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">983</post-id>	</item>
		<item>
		<title>What do You Want to See in the NumPy Docs?</title>
		<link>https://contentsimplicity.com/numpy-scipy-google-season-of-docs/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=numpy-scipy-google-season-of-docs&#038;utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=numpy-scipy-google-season-of-docs</link>
		
		<dc:creator><![CDATA[Anne B]]></dc:creator>
		<pubDate>Thu, 19 Sep 2019 00:12:55 +0000</pubDate>
				<category><![CDATA[NumPY]]></category>
		<category><![CDATA[SciPy]]></category>
		<category><![CDATA[Season of Docs]]></category>
		<category><![CDATA[writing]]></category>
		<category><![CDATA[documentation]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[technical documentation]]></category>
		<guid isPermaLink="false">https://contentsimplicity.com/?p=991</guid>

					<description><![CDATA[<p>Behind the scenes at NumPy and SciPy with Google Season of Docs Season of Docs has begun!!! What is Google Season of Docs? Google did an amazing thing by creating Season of Docs. It built real opportunities for technical writers to collaborate with open source organizations. Season of Docs is a three-month mentoring program. It pairs [&#8230;]</p>
<p>The post <a href="https://contentsimplicity.com/numpy-scipy-google-season-of-docs/">What do You Want to See in the NumPy Docs?</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></description>
										<content:encoded><![CDATA[



<h2 class="wp-block-heading">Behind the scenes at NumPy and SciPy with Google Season of Docs</h2>



<p class="wp-block-paragraph"></p>



<p style="font-size:36px;text-align:center" class="has-custom-size wp-block-paragraph"><a href="https://developers.google.com/season-of-docs/" rel="noreferrer noopener" target="_blank">Season of Docs</a> has begun!!!</p>



<p class="wp-block-paragraph"></p>



<h2 class="wp-block-heading">What is Google Season of Docs?</h2>



<p class="wp-block-paragraph">Google did an amazing thing by creating Season of Docs. It built real opportunities for technical writers to collaborate with open source organizations.</p>



<p class="wp-block-paragraph">Season of Docs is a three-month mentoring program that pairs technical writers with open source organizations. Writers have the opportunity to work with well-known and highly regarded organizations. Open source organizations (which often don’t have a budget for technical writers) have the opportunity to work with experienced technical writers to improve and expand their existing documentation.</p>



<p class="wp-block-paragraph">It’s pretty incredible.</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"><p>The goal of Season of Docs is to provide a framework for technical writers and open source projects to work together towards the common goal of improving an open source project’s documentation. For technical writers who are new to open source, the program provides an opportunity to gain experience in contributing to open source projects. For technical writers who’re already working in open source, the program provides a potentially new way of working together. Season of Docs also gives open source projects an opportunity to engage more of the technical writing community.</p><p>During the program, technical writers spend a few months working closely with an open source community. They bring their technical writing expertise to the project’s documentation, and at the same time learn about the open source project and new technologies.</p><p>The open source projects work with the technical writers to improve the project’s documentation and processes. Together they may choose to build a new documentation set, or redesign the existing docs, or improve and document the open source community’s contribution procedures and onboarding experience.</p><p>Together, we raise public awareness of open source docs, of technical writing, and of how we can work together to the benefit of the global open source community.</p><cite>~<a rel="noreferrer noopener" href="https://developers.google.com/season-of-docs/docs/" target="_blank">Introduction to Google Season of Docs</a></cite></blockquote>



<p class="wp-block-paragraph">It’s a win-win!</p>



<p class="wp-block-paragraph"></p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2AhpKny4cVB92ZkKG2BDPWAw.jpeg?w=1080&#038;ssl=1" alt=""/><figcaption>Photo by <a rel="noreferrer noopener" href="https://www.pexels.com/@it-s-me-marrie-1418249?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels" target="_blank">It’s me, Marrie </a>from <a rel="noreferrer noopener" href="https://www.pexels.com/photo/pembroke-welsh-corgi-sticking-its-tongue-out-2737392/?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels" target="_blank">Pexels</a></figcaption></figure>



<p class="wp-block-paragraph"></p>



<h2 class="wp-block-heading">What is NumPy?</h2>



<p class="wp-block-paragraph">At its most basic level, <a href="https://www.numpy.org/" rel="noreferrer noopener" target="_blank">NumPy</a> is numeric, or numerical (<strong>Num</strong>) Python (<strong>Py</strong>).</p>



<p class="wp-block-paragraph">From the official documentation:</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"><p>“NumPy is the fundamental package for scientific computing in Python. It is a Python library that provides a multidimensional array object, various derived objects (such as masked arrays and matrices), and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more.”</p></blockquote>



<p class="wp-block-paragraph">It’s a hugely important open source Python library. It’s the core library for scientific computing in Python. It’s useful in data science, machine learning, deep learning, artificial intelligence, computer vision, science, engineering, and more. It adds support for large, multi-dimensional arrays and matrices and a huge collection of high-level mathematical functions that can operate on the arrays.</p>
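<p class="wp-block-paragraph">To make that a little more concrete, here’s a minimal sketch of what working with an array looks like. The temperature values and variable names are made up purely for illustration; only the NumPy calls themselves (<code>np.array</code>, element-wise arithmetic, and <code>mean</code>) come from the library.</p>

```python
import numpy as np

# A 2-D array (2 rows, 3 columns) built from nested Python lists.
# The numbers are hypothetical temperature readings in Celsius.
temps_c = np.array([[20.0, 22.5, 19.0],
                    [25.0, 27.5, 24.0]])

# Math applies to every element at once, with no explicit loop.
temps_f = temps_c * 9 / 5 + 32

# Basic statistics along an axis: the mean of each row.
row_means = temps_c.mean(axis=1)

print(temps_c.shape)   # (2, 3)
print(temps_f[0, 0])   # 68.0
print(row_means)       # [20.5 25.5]
```

<p class="wp-block-paragraph">The whole conversion happens in one line because NumPy operates on the entire array at once, which is a big part of why it’s so popular for working with data.</p>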



<p class="wp-block-paragraph">The ancestor of NumPy (Numeric) was originally created by <a href="https://en.wikipedia.org/wiki/Jim_Hugunin" rel="noreferrer noopener" target="_blank">Jim Hugunin</a>. By 2000, interest in creating a complete environment for scientific and technical computing was growing. In 2001, Travis Oliphant, Eric Jones, and Pearu Peterson merged code they had written and called the resulting package SciPy. In early 2005, <a href="https://en.wikipedia.org/wiki/Travis_Oliphant" rel="noreferrer noopener" target="_blank">Travis Oliphant</a> wanted to unify the community around a single array package, so he created NumPy by incorporating features of Numarray into Numeric with tons of modifications. The project was originally part of <a href="https://en.wikipedia.org/wiki/SciPy" rel="noreferrer noopener" target="_blank">SciPy</a>, but it was separated out and called NumPy so that people wouldn’t have to install the large SciPy package just to get an array object. NumPy 1.0 was released in 2006.</p>



<p class="wp-block-paragraph"></p>



<h2 class="wp-block-heading">What’s SciPy?</h2>



<p class="wp-block-paragraph">It’s scientific (<strong>Sci</strong>) Python (<strong>Py</strong>)! <a href="https://www.scipy.org/" rel="noreferrer noopener" target="_blank">SciPy</a> is a free and open source Python library used for scientific and technical computing. It contains modules for <a href="https://en.wikipedia.org/wiki/Optimization_%28mathematics%29" rel="noreferrer noopener" target="_blank">optimization</a>, <a href="https://en.wikipedia.org/wiki/Linear_algebra" rel="noreferrer noopener" target="_blank">linear algebra</a>, <a href="https://en.wikipedia.org/wiki/Integral" rel="noreferrer noopener" target="_blank">integration</a>, <a href="https://en.wikipedia.org/wiki/Interpolation" rel="noreferrer noopener" target="_blank">interpolation</a>, <a href="https://en.wikipedia.org/wiki/Special_functions" rel="noreferrer noopener" target="_blank">special functions</a>, <a href="https://en.wikipedia.org/wiki/Fast_Fourier_transform" rel="noreferrer noopener" target="_blank">FFT</a>, <a href="https://en.wikipedia.org/wiki/Signal_processing" rel="noreferrer noopener" target="_blank">signal</a> and <a href="https://en.wikipedia.org/wiki/Image_processing" rel="noreferrer noopener" target="_blank">image processing</a>, <a href="https://en.wikipedia.org/wiki/Ordinary_differential_equation" rel="noreferrer noopener" target="_blank">ODE</a> solvers, and other tasks common in science and engineering. SciPy uses NumPy arrays as its basic data structure.</p>
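<p class="wp-block-paragraph">Here’s a small sketch of two of those modules in action, integration and optimization. The functions being integrated and minimized are toy examples chosen for illustration (they’re easy to check by hand); the calls <code>integrate.quad</code> and <code>optimize.minimize_scalar</code> are part of SciPy.</p>

```python
import numpy as np
from scipy import integrate, optimize

# Integration: area under sin(x) from 0 to pi (the exact answer is 2).
# quad returns the estimate plus an estimate of the absolute error.
area, abs_error = integrate.quad(np.sin, 0, np.pi)

# Optimization: find the x that minimizes (x - 3)**2 (the exact answer is 3).
result = optimize.minimize_scalar(lambda x: (x - 3) ** 2)

print(round(area, 6))      # approximately 2.0
print(round(result.x, 6))  # approximately 3.0
```

<p class="wp-block-paragraph">Notice that the integrand here is a NumPy function. That’s the relationship in a nutshell: NumPy supplies the arrays and basic math, and SciPy layers the scientific routines on top.</p>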



<p class="wp-block-paragraph">SciPy builds on the <a href="https://en.wikipedia.org/wiki/NumPy" rel="noreferrer noopener" target="_blank">NumPy</a> array object. It’s part of the NumPy stack. The stack includes tools like <a href="https://en.wikipedia.org/wiki/Matplotlib" rel="noreferrer noopener" target="_blank">Matplotlib</a>, <a href="https://en.wikipedia.org/wiki/Pandas_%28software%29" rel="noreferrer noopener" target="_blank">Pandas</a>, and <a href="https://en.wikipedia.org/wiki/SymPy" rel="noreferrer noopener" target="_blank">SymPy</a>, and an expanding set of scientific computing libraries. The stack’s users come from all fields of science, engineering, and beyond. Python has one of the largest, if not <em>the</em> largest, scientific user communities. Comparable communities have formed around R, Julia, and MATLAB.</p>



<p class="wp-block-paragraph"><em>Still with me?</em></p>



<p class="wp-block-paragraph"></p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2ACdjLGSmm0KUNQAZ8C_rFzA.jpeg?w=1080&#038;ssl=1" alt=""/><figcaption>Photo by <a href="https://www.pexels.com/@passerina-523993?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels" rel="noreferrer noopener" target="_blank">Passerina </a>from&nbsp;<a href="https://www.pexels.com/photo/tilt-shift-lens-of-yellow-napped-amazon-1257855/?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels" rel="noreferrer noopener" target="_blank">Pexels</a></figcaption></figure>



<p class="wp-block-paragraph"></p>



<h2 class="wp-block-heading">The Process</h2>



<p class="wp-block-paragraph">Google announced Season of Docs in March 2019. In April, open source organizations had the opportunity to apply to be a part of the program. Google announced the selected organizations on April 30. Technical writers were able to look over the list of 45 organizations and choose projects that interested them. They could submit up to three project proposals. Technical writer applications were open from May 29 to June 28! After the application deadline, each organization selected the technical writing projects that it was interested in mentoring.</p>



<p class="wp-block-paragraph"><a href="https://opensource.googleblog.com/2019/08/season-of-docs-announces-technical.html" rel="noreferrer noopener" target="_blank">On August 6, Google announced the accepted writing projects!</a></p>



<p class="wp-block-paragraph">The program received more than 700 technical writing project proposals from nearly 450 technical writers. Each organization was able to select one technical writer for an approved project. The NumPy/SciPy team, however, believes so strongly in moving its documentation forward that it went above and beyond and secured additional funding outside of Season of Docs. This allowed the team to include three more writers under the same conditions as Season of Docs.</p>



<p class="wp-block-paragraph"></p>



<h2 class="wp-block-heading">Where did the funding come from?</h2>



<p class="wp-block-paragraph">NumPy received two grants that are kind of a package deal (you can read about them <a href="https://www.moore.org/grant-detail?grantId=GBMF5447" rel="noreferrer noopener" target="_blank">here</a> and <a href="https://bids.berkeley.edu/news/bids-receives-sloan-foundation-grant-contribute-numpy-development" rel="noreferrer noopener" target="_blank">here</a>). The Moore and Sloan foundations awarded $1.3M to the Berkeley Institute for Data Science (BIDS) to support the development of NumPy. The funding period runs from April 2018 to October 2020. (<a href="https://mentat.za.net/" rel="noreferrer noopener" target="_blank">Stéfan van der Walt</a>, a NumPy Steering Council member, agreed to provide the funds from that grant.)</p>



<p class="wp-block-paragraph"><a href="https://rgommers.github.io/" rel="noreferrer noopener" target="_blank">Ralf Gommers</a>, one of the core programmers behind NumPy and SciPy and the Director of <a href="https://www.quansight.com/" rel="noreferrer noopener" target="_blank">Quansight Labs</a>, is the point of contact for both organizations. Ralf is an incredible person, and he had this to say about Season of Docs:</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"><p>“When I first saw the Season of Docs announcement, I loved the idea of the program — working with a tech writer would be both an interesting new experience for me personally, and potentially massively beneficial to NumPy and SciPy. So I spent a lot of effort on both writing a very engaging ideas page, and then following up with writers that showed interest. I probably had ~10 video calls, and many more email threads.</p><p>Then, it turned out that there was a lot of interest, and the quality of applicants and proposals was really high. I started thinking about how to not only get one or two 3-month projects running, but how to engage these writers in a way that would make them enjoy the experience enough to stay around after the project. One thing that came to mind was that people like working with like-minded others. However, we don’t yet have technical writers — adding one to NumPy and one to SciPy may not be enough. So I decided to start building a documentation team. The ideas and people were there, so next what’s needed is funding.</p><p>NumPy has a significant active grant, so I discussed the possibility of using some of that grant funding for the extra Season of Docs projects with Stéfan. Stéfan is awesome, and he also sees the value of both the proposed projects and of building a team of writers. So he agreed to reserve some funds for this purpose. So here we are today — excited to get started!”</p><cite>~Ralf Gommers</cite></blockquote>



<h2 class="wp-block-heading">Who are the&nbsp;writers?</h2>



<p class="wp-block-paragraph">The writers selected for the NumPy/SciPy documentation projects are amazing, and you need to know who they are!</p>



<p class="wp-block-paragraph"></p>



<h4 class="wp-block-heading"><a rel="noreferrer noopener" href="https://lmu-munich.academia.edu/MajaGwóźdź" target="_blank">Maja Gwozdz</a></h4>



<p class="wp-block-paragraph">The official technical writer selected by SciPy during Season of Docs is Maja Gwozdz. Her project proposal is called “User-oriented documentation and thorough restructuring.” You can <a href="https://developers.google.com/season-of-docs/docs/participants/" rel="noreferrer noopener" target="_blank">read all about it here</a>, but essentially, Maja intends to refactor the existing documentation so that it’s easily accessible to users with different needs.</p>



<p class="wp-block-paragraph">Maja has done some knockout research, which <a href="https://lmu-munich.academia.edu/MajaGwóźdź" rel="noreferrer noopener" target="_blank">you can find here</a>. She has not only had significant experience with SciPy, but she’s well aware of what a difference great documentation and guides can make.</p>



<p class="wp-block-paragraph"></p>



<h4 class="wp-block-heading"><a rel="noreferrer noopener" href="https://medium.com/@annebonner" target="_blank">Anne Bonner</a></h4>



<p class="wp-block-paragraph">Yours truly (yay!) was the official selection for NumPy, with the project proposal, “Making ‘The Basics’ a Little More Basic: Improving the Introductory NumPy Sections.” Since there’s nothing that makes me happier than helping beginners understand complex information and technologies, NumPy is the perfect challenge!</p>



<p class="wp-block-paragraph">I’m excited to dig into the introductory NumPy materials to create something more accessible for people with little or no experience. NumPy is in such an interesting position: it’s incredibly complex, but it’s also one of the most important libraries for beginners who are interested in working with data. I’ll be creating beginner-level documentation of basic concepts in NumPy that can function as a stepping stone for people who want to <em>use</em> NumPy, not necessarily study it.</p>



<p class="wp-block-paragraph"></p>



<h4 class="wp-block-heading"><a rel="noreferrer noopener" href="https://medium.com/@Shekharrajak" target="_blank">Shekhar Rajak</a></h4>



<p class="wp-block-paragraph"><a href="https://medium.com/@Shekharrajak" target="_blank" rel="noreferrer noopener">Shekhar Rajak</a> was selected for “Numpy.org redesign and high-level documentation restructuring for end-user focus.” His goals for the project include:</p>



<ul class="wp-block-list"><li>Designing and developing a better UI for <a rel="noreferrer noopener" href="http://www.numpy.org" target="_blank">www.numpy.org</a></li><li>Enhancing and modifying the contents of <a rel="noreferrer noopener" href="http://www.numpy.org" target="_blank">www.numpy.org</a>: NumPy User Guide, NumPy Benchmarking, F2Py Guide, NumPy Developer Guide, Building and Extending the Documentation, NumPy Reference, About NumPy, Reporting bugs, and all other development-related pages.</li><li>Adding content about when to use NumPy and when to use Python libraries like XND and Dask array, which provide similar APIs.</li><li>Preserving the Python API documentation.</li></ul>



<p class="wp-block-paragraph"></p>



<h4 class="wp-block-heading">Brandon David</h4>



<p class="wp-block-paragraph">Brandon David was selected for his project “Improve the documentation of scipy.stats.” Brandon plans to fill out missing functions as well as add examples and internal links. His goal is to clear up ambiguity and work through issues on GitHub.</p>



<p class="wp-block-paragraph"></p>



<h4 class="wp-block-heading">Christina Lee</h4>



<p class="wp-block-paragraph">Christina Lee was selected for her proposal, “SciPy documentation: Design, Usability and Content.” She is a recent addition, and I’m looking forward to sharing her work with you soon!</p>



<p class="wp-block-paragraph"></p>



<h4 class="wp-block-heading"><a rel="noreferrer noopener" href="https://medium.com/@harivallabharangarajan" target="_blank">Harivallabha Rangarajan</a></h4>



<p class="wp-block-paragraph"><a href="https://medium.com/@harivallabharangarajan" target="_blank" rel="noreferrer noopener">Harivallabha Rangarajan</a> is planning to contribute to the documentation and complement the work of the writers selected for Season of Docs in any way he can. He’s particularly interested in writing end-to-end tutorials for the scipy.stats module. He writes that “having more comprehensive tutorials will help users get a better idea of how and where the available methods may be used in the pipeline.”</p>



<p class="wp-block-paragraph" style="text-align:center"><strong>Welcome to Season of Docs!!!</strong></p>



<p class="wp-block-paragraph">It’s incredible to be involved in the inner workings of NumPy and SciPy. So far, we’ve been joining meetings with the team, getting to know the core players, and learning the workflow. I can’t wait to keep you guys updated with our projects as they develop!</p>



<p class="wp-block-paragraph"></p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2AsU0sCx8GPNzmbt0gIZWJwA.jpeg?w=1080&#038;ssl=1" alt=""/><figcaption>Photo by <a href="https://www.pexels.com/@psco?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels" rel="noreferrer noopener" target="_blank">Pineapple Supply Co. </a>from&nbsp;<a href="https://www.pexels.com/photo/photo-of-pineapple-wearing-black-aviator-style-sunglasses-and-party-hat-1071878/?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels" rel="noreferrer noopener" target="_blank">Pexels</a></figcaption></figure>



<p class="wp-block-paragraph"></p>



<h2 class="wp-block-heading">Get involved!</h2>



<p class="wp-block-paragraph">Now that you know the major players on the writing side, don’t be afraid to reach out and let us know if there’s information you want to see in the official documentation! Who knows, we might just be able to give you what you want to see.</p>



<p class="wp-block-paragraph">If the idea of getting involved with open-source organizations interests you, get in there and start sharing! Don’t wait for an invitation. Start contributing now! It’s up to everyone to make the tech world an even more amazing place than it already is.</p>



<p class="wp-block-paragraph">If you’re interested in contributing to open-source organizations but have no idea how to get started with GitHub, you might want to check out this article:</p>



<p class="wp-block-paragraph"><a href="https://contentsimplicity.com/getting-started-with-git-and-github-the-complete-beginners-guide/" target="_blank" rel="noreferrer noopener" aria-label="Getting started with Git and GitHub: the complete beginner’s guide
Git and GitHub basics for the curious and completely confused (opens in a new tab)"><strong>Getting started with Git and GitHub: the complete beginner’s guide</strong><br><em>Git and GitHub basics for the curious and completely confused</em></a></p>



<p class="wp-block-paragraph"></p>



<p class="wp-block-paragraph">Thanks for reading! As always, if you do anything cool with this information, let everyone know about it in the comments below!</p>
<span class="et_bloom_bottom_trigger"></span><p>The post <a href="https://contentsimplicity.com/numpy-scipy-google-season-of-docs/">What do You Want to See in the NumPy Docs?</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">991</post-id>	</item>
		<item>
		<title>How to Effortlessly Connect OBIEE to Tableau 2019.2</title>
		<link>https://contentsimplicity.com/how-to-connect-obiee-to-tableau/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=how-to-connect-obiee-to-tableau&#038;utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=how-to-connect-obiee-to-tableau</link>
		
		<dc:creator><![CDATA[Anne B]]></dc:creator>
		<pubDate>Mon, 09 Sep 2019 05:00:00 +0000</pubDate>
				<category><![CDATA[Uncategorized]]></category>
		<guid isPermaLink="false">http://contentsimplicity.com/?p=415</guid>

					<description><![CDATA[<p>Are you frustrated with how difficult it is to visualize your data the way you want to? Do you want your OBIEE data to easily connect with Tableau? </p>
<p>The post <a href="https://contentsimplicity.com/how-to-connect-obiee-to-tableau/">How to Effortlessly Connect OBIEE to Tableau 2019.2</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></description>
										<content:encoded><![CDATA[



<p class="wp-block-paragraph"><em>Image by <a href="https://pixabay.com/users/lextotan-7839095/?utm_source=link-attribution&amp;utm_medium=referral&amp;utm_campaign=image&amp;utm_content=3135551">Evren Ozdemir</a> from <a href="https://pixabay.com/?utm_source=link-attribution&amp;utm_medium=referral&amp;utm_campaign=image&amp;utm_content=3135551">Pixabay</a></em></p>



<p class="wp-block-paragraph">Are you frustrated with how difficult it is to visualize your OBIEE data the way you want to? Do you wish that your OBIEE data could simply and securely connect with Tableau? </p>



<p class="wp-block-paragraph">It can! BI Connector allows you to securely access and use your OBIEE and Taleo data right in Tableau, PowerBI, and Qlik.</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2A-eIDv4EaEeFQjnLI14M84A.png?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">If you want to use visualizations to communicate the results of your analysis, you probably want to work with modern, easy-to-use visual analytics tools like Tableau. But there’s no simple way to do this with your OBIEE data! Making it happen manually means wasting an average of 4–5 hours per week creating and using exports and imports of OBIEE data into a visualization tool. That can translate to tens of thousands of dollars wasted. With BI Connector, you can easily connect to OBIEE subject areas and reports in minutes. You simply log in with your OBIEE credentials, reusing your existing OBIEE business logic. You don’t have to make any changes to OBIEE.</p>



<p class="wp-block-paragraph">You save time and money.</p>



<p class="wp-block-paragraph">BI Connector is perfect for anyone interested in machine learning, data analysis, data visualization, business analysis, data science, and predictive analysis. It’s easy enough for beginners to use and it offers benefits that the most advanced power users will enjoy. It’s the number one BI integration solution used by enterprise customers. It’s simple, secure, and efficient. Setup takes less than five minutes. You can run direct query-based live connections to both subject areas and reports and immediately use your results to create gorgeous, responsive, and intuitive visualizations. It allows you to make faster decisions and avoid common errors.</p>



<p class="wp-block-paragraph">It will save you an incredible amount of time and effort.</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2AakVOajSkjZ_ICGqkpGmaRA.jpeg?w=1080&#038;ssl=1" alt=""/><figcaption>Photo by nickgesell via&nbsp;<a href="http://pixabay.com/" rel="noreferrer noopener" target="_blank">Pixabay</a></figcaption></figure>



<p class="wp-block-paragraph">BI Connector is hands-down the easiest way to connect the capabilities of OBIEE with the intuitive visualizations of Tableau. It allows you to take the power and security of OBIEE and effortlessly combine it with everything your favorite visualization tool has to offer.</p>



<h4 class="wp-block-heading"><a href="http://www.oracle.com/us/solutions/business-analytics/business-intelligence/enterprise-edition/overview/index.html" rel="noreferrer noopener" target="_blank">OBIEE</a></h4>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2Af7KgWkU6pXQcu2JwRHLH5Q.jpeg?w=1080&#038;ssl=1" alt=""/><figcaption>Image labeled for reuse via&nbsp;<a href="https://commons.wikimedia.org/wiki/File:Logo_oracle.jpg" rel="noreferrer noopener" target="_blank">Wikimedia Commons</a></figcaption></figure>



<p class="wp-block-paragraph">If you’re serious about data, there’s a very good chance that you’re already using&nbsp;<a href="http://www.oracle.com/us/solutions/business-analytics/business-intelligence/enterprise-edition/overview/index.html" rel="noreferrer noopener" target="_blank">OBIEE</a>. It’s amazing for data reporting and intelligence. It can hold a huge volume of data and it’s perfect for medium and large enterprises. It also handles complex structures extremely well. Plus Oracle has developed pre-defined BI solutions that are available in OBIEE. When you create a BI solution in OBIEE, that solution is implemented immediately. It offers interactive dashboards and reporting, actionable intelligence, proactive detection and alerts, Microsoft Office integration, and a lot more.</p>



<p class="wp-block-paragraph">That being said, there are simply not as many visualization options available in OBIEE as there are in Tableau. The options that are available are not as user-friendly as the ones in Tableau. OBIEE also has a limited ability to work with other tools and often requires the purchase of an extra license to use them.&nbsp;</p>



<p class="wp-block-paragraph">OBIEE also requires a significant amount of education to use properly and it’s not as easy to connect your OBIEE data to Tableau as it could be. You can do it, of course, but solutions like creating Excel exports or SQL scripts waste time. Wasted time is wasted money. (It’s also worth noting that exporting your data and then importing it puts your data at risk of unauthorized access.)</p>



<h4 class="wp-block-heading"><a href="https://www.tableau.com/" rel="noreferrer noopener" target="_blank">Tableau</a></h4>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2AwKUX7nyW3UrJnixzEa0RXQ.png?w=1080&#038;ssl=1" alt=""/><figcaption>Image labeled for reuse via&nbsp;<a href="https://en.wikipedia.org/wiki/File:Tableau_Logo.png" rel="noreferrer noopener" target="_blank">Wikipedia</a></figcaption></figure>



<p class="wp-block-paragraph"><a href="https://www.tableau.com/" rel="noreferrer noopener" target="_blank">Tableau</a>, on the other hand, offers gorgeous and easy-to-create visualizations. It’s well-suited for small and medium enterprises. It allows for the use of a number of different tools. It’s intuitive and user-friendly, offering simple drag-and-drop responsive charts, one-click formulas, filters, and a lot more. There are also some fun new features in 2019.2, including vector-based maps!&nbsp;<a href="https://www.tableau.com/products/new-features" rel="noreferrer noopener" target="_blank">You can check out the new features here</a>. But while Tableau can handle a lot of data, it can’t manage the huge volume of data that OBIEE can handle with ease. It’s very challenging to use when you have more than 25 tables or more than 16 columns. And everything in Tableau needs to be developed from scratch.</p>



<p class="wp-block-paragraph">These two tools aren’t replacements for each other. They can do impressive work together! But getting them to work together can be challenging. If you decide to export your data from OBIEE and then import it into Tableau, you wind up duplicating data and duplicating logic as well. Your results might be inconsistent, and you expose yourself to potential security risks.&nbsp;</p>



<p class="wp-block-paragraph">That’s where BI Connector comes in!</p>



<p class="wp-block-paragraph">BI Connector is the fun, simple, and secure way to connect OBIEE and Tableau. With BI Connector, you can create your visualizations in Tableau using your OBIEE data in no time. BI Connector uses the OBIEE security model, so your data is protected. It automates the process of moving your OBIEE data into Tableau and it keeps your data safe.</p>



<p class="wp-block-paragraph">BI Connector is great for everyone on your team, from IT directors and analysts to human resources and even your sales and marketing team. It sits right on top of the OBIEE layer and allows you to easily integrate your favorite data visualization tool. It’s simple and intuitive and it bridges the gap between technologies, saving you an incredible amount of time and money. You get plug-and-play access for your users from what they’ve already built in OBIEE. You can run your results right from your data warehouse. It’s also fun and secure! You can create your visualizations in minutes while protecting your data with the OBIEE security model.</p>



<p class="wp-block-paragraph">If you’ve already invested thousands of dollars in your data tools, why waste any time trying to get those tools to work together? BI Connector is the simple way to connect your tools. You don’t have to make any changes to OBIEE or to Tableau. Plus, you’re working with a tested and secure corporate data warehouse. You get self-service data visualization with the security and governance of OBIEE.</p>



<h4 class="wp-block-heading">How to Use BI Connector to connect OBIEE and&nbsp;Tableau</h4>



<p class="wp-block-paragraph">BI Connector is incredibly easy to install. It takes less than five minutes. You can find a helpful&nbsp;<a href="https://support.biconnector.com/support/solutions/articles/8000032391-connect-tableau-desktop-to-obiee-bi-connector-desktop-edition-step-by-step-installation-guide" rel="noreferrer noopener" target="_blank">step-by-step guide</a> here and&nbsp;<a href="https://www.biconnector.com/getting-started-with-bi-connector-for-tableau/" rel="noreferrer noopener" target="_blank">really helpful videos here</a> that will walk you through the process if you run into any issues.&nbsp;</p>



<h4 class="wp-block-heading">Step 1:&nbsp;Download</h4>



<p class="wp-block-paragraph">First, you’ll need to go to the&nbsp;<a href="https://www.biconnector.com/" rel="noreferrer noopener" target="_blank">BI Connector website</a> to download BI Connector. You can click the button that says “Try it free” and drop down the menu to “Visualize OBIEE data in” and click on “Tableau.”</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2A-ODK0y9Av8APqhSY04R4Ng.png?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">Enter your information, click the button that says “Try BI Connector for Tableau” and your download will start. There’s no credit card or commitment required. You have 30 days to try it out and see what a game-changer it is.</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2AwX9ljDXGF6BhNSahzGPpnw.png?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">You’ll double click on the “BIConnector-Desktop-Edition-x64-Tableau.exe” file to unzip it and specify the location and then start the installation. You’ll click on the button that says “Install” and follow the prompts.</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/2400/1%2Azi2NU_lDVLwfD9z86IrlxQ.png?w=1080&#038;ssl=1" alt=""/></figure>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/1200/1%2AsIhYpqIIXedFjvK6ifKKFA.png?w=1080&#038;ssl=1" alt=""/></figure>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/1200/1%2Ac-B_tUTMO5FnNTeN4q2HHw.png?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">At the end of the installation process, you’ll see a popup window that lets you know that your installation has been completed. Make sure the box is checked next to the line that says “Launch ODBC Administrator” and click “Finish.”</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2A2XQimtM_MBQfcYYuqL0BLA.png?w=1080&#038;ssl=1" alt=""/></figure>



<h4 class="wp-block-heading">Step 2: ODBC Administrator and License Activation</h4>



<p class="wp-block-paragraph">Next, you’ll want to create a new data source. Go to the ODBC Database Administrator and click “Add.” A box will pop up where you can activate your 30-day trial license. Enter your personal information and then either leave the trial license number as-is (it will show up automatically) or change it to the key for the license you purchased and click “Activate.” This will take you back to the window where you will now be able to create a new data source.&nbsp;</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2AguRRVtebkjfLAB1aWeZCAg.png?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">You’ll click on the button that says “Add.”</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2A7PkNPun8DBAGP3y-COq_Hw.png?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">You’ll need to enter the data source name (OBIEE Connect here) and your server name, port, user ID, and password. This is the information you use for your Oracle BI server.&nbsp;</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"><p>If you log in to OBIEE with something like “http://obieefunstuff.websitename.com:9704/analytics&#8221; then your server name will be “http://obieefunstuff.websitename.com&#8221; and your port will be 9704. Your user ID and password are your OBIEE user ID and password. It’s a good idea to take a look at the&nbsp;<a href="https://support.biconnector.com/support/solutions/articles/8000032391-connect-tableau-desktop-to-obiee-bi-connector-desktop-edition-step-by-step-installation-guide" rel="noreferrer noopener" target="_blank">official step-by-step guide</a> for more information. It walks you through some common scenarios, including what to do if you don’t see your port number in the URL.</p></blockquote>
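<p class="wp-block-paragraph">If you’d rather not pick the URL apart by eye, here is a minimal Python sketch of the same split. The helper name <code>split_obiee_url</code> is made up for this illustration and is not part of BI Connector; it only uses Python’s standard library.</p>

```python
from urllib.parse import urlsplit


def split_obiee_url(login_url):
    """Return (server_name, port) from an OBIEE login URL.

    Hypothetical helper for illustration only; it is not part of BI Connector.
    """
    parts = urlsplit(login_url)
    # Server name as described above: the scheme plus the host, no port or path.
    server_name = f"{parts.scheme}://{parts.hostname}"
    # parts.port is None when the URL does not include an explicit port.
    return server_name, parts.port


server, port = split_obiee_url("http://obieefunstuff.websitename.com:9704/analytics")
print(server, port)
```

<p class="wp-block-paragraph">If the port comes back as <code>None</code>, your URL doesn’t show one, which is the scenario the official guide covers.</p>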



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2AxOCwln8vJI8YEEA3DR7j2w.png?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">Now click “Test connection” to make sure everything is working.&nbsp;</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2AZRy5w9ikrsYtl3341_lvKA.png?w=1080&#038;ssl=1" alt=""/></figure>



<h4 class="wp-block-heading">Step 3: Configure access for Subject Areas or&nbsp;Reports</h4>



<p class="wp-block-paragraph">In the next window, you’ll see two little radio buttons that allow you to select either “Subject Areas” or “Reports.” Go ahead and choose one or the other and hit “Save.” You can always go to this screen again any time you want to change or update your information.</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2ACpRVtoJ1du-PtXMuG0Adww.png?w=1080&#038;ssl=1" alt=""/></figure>



<h4 class="wp-block-heading">Step 4: Configure Tableau</h4>



<p class="wp-block-paragraph">Now it’s time to head over to Tableau! Launch Tableau, then click “More Servers” and then go to “Other Databases (ODBC).” Now you can select the data source that you created and click “Connect.” Make sure your server name, port, and User ID are exactly the same as the ones you used previously and enter the same password here as well. Test your connection and hit “OK.” You’ll click “OK” again on the “Other Databases (ODBC)” screen.</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/1200/1%2A2jrInrXoFlLD9Z9o88DFwg.png?w=1080&#038;ssl=1" alt=""/></figure>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/1200/1%2AvoGuOzpbdwWi5PZSze7SnA.png?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">You’re ready to start visualizing your data!&nbsp;</p>



<p class="wp-block-paragraph">Go ahead and select your database from the dropdown menu and click “OK.”&nbsp;</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2AgIpMiv9EGerTle45Q6DbHA.png?w=1080&#038;ssl=1" alt=""/></figure>



<h4 class="wp-block-heading">Done</h4>



<p class="wp-block-paragraph">That’s it! You’re all set!</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2A0EhCgaiBRZlRLpP8V4VBGA.png?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">You can drag and drop your information just like you normally would! Easily create filters, add color, labels, tooltips, and everything else you normally want to do with Tableau. You have your data at your fingertips and you can work with it in exactly the way you want to.</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/2400/1%2ANjBEFdV-5lQDmHVTLnoV3A.png?w=1080&#038;ssl=1" alt=""/></figure>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/800/1%2AyuVDV80fpstJOIPGHsUHSw.png?w=1080&#038;ssl=1" alt=""/></figure>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/800/1%2AH-vUon7z2GMZ4YFjc316UA.png?w=1080&#038;ssl=1" alt=""/></figure>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/800/1%2ARcnnMEnEQ4USSZ1o7HO7aA.png?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">You can even blend OBIEE data with other data files right in Tableau in exactly the same way that you normally would. Start with the same process, then click “Add” in the upper left corner or drop your Data menu down to “New Data Source.” Select “Text File,” for example, if you have data in a CSV file. That will open a window where you can select your file.&nbsp;</p>



<p class="wp-block-paragraph">Now you’re all set to work with your additional data!</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/1200/1%2ANeuIS9eriTFP_WdjWco7fg.png?w=1080&#038;ssl=1" alt=""/></figure>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/cdn-images-1.medium.com/max/1200/1%2ARLAmGT9e_kvCdfUzwGaT1g.png?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">It’s really that easy. You’re minutes away from being able to combine the power of OBIEE with the ease and appeal of Tableau.&nbsp;</p>



<p class="wp-block-paragraph">What are you waiting for? Give&nbsp;<a rel="noreferrer noopener" href="https://www.biconnector.com/" target="_blank">BI Connector</a> a try today!<br></p>
<span class="et_bloom_bottom_trigger"></span><p>The post <a href="https://contentsimplicity.com/how-to-connect-obiee-to-tableau/">How to Effortlessly Connect OBIEE to Tableau 2019.2</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">415</post-id>	</item>
		<item>
		<title>What is Deep Learning and How Does it Work?</title>
		<link>https://contentsimplicity.com/what-is-deep-learning-and-how-does-it-work/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=what-is-deep-learning-and-how-does-it-work&#038;utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=what-is-deep-learning-and-how-does-it-work</link>
		
		<dc:creator><![CDATA[Anne B]]></dc:creator>
		<pubDate>Sat, 07 Sep 2019 07:01:00 +0000</pubDate>
				<category><![CDATA[artificial intelligence]]></category>
		<category><![CDATA[backpropagation]]></category>
		<category><![CDATA[deep learning]]></category>
		<category><![CDATA[feedforward]]></category>
		<category><![CDATA[gradient descent]]></category>
		<category><![CDATA[machine learning]]></category>
		<category><![CDATA[neural networks]]></category>
		<category><![CDATA[neuron]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[weighted sum]]></category>
		<guid isPermaLink="false">https://contentsimplicity.com/?p=795</guid>

					<description><![CDATA[<p>The inspiration for deep learning is the way that the human brain filters information. It’s literally an artificial neural network.</p>
<p>The post <a href="https://contentsimplicity.com/what-is-deep-learning-and-how-does-it-work/">What is Deep Learning and How Does it Work?</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p class="wp-block-paragraph"></p>


<p><em>Photo by <a class="markup--anchor markup--figure-anchor" href="https://www.pexels.com/@chevanon?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels" target="_blank" rel="noopener noreferrer" data-href="https://www.pexels.com/@chevanon?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels">Chevanon Photography </a>from&nbsp;<a class="markup--anchor markup--figure-anchor" href="https://www.pexels.com/photo/close-up-of-a-siamese-fighting-fish-325045/?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels" target="_blank" rel="noopener noreferrer" data-href="https://www.pexels.com/photo/close-up-of-a-siamese-fighting-fish-325045/?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels">Pexels</a></em></p>
<h4>&nbsp;</h4>
<h4 class="graf graf--h4">Sit back, relax, and get comfortable with cool concepts like neural networks, gradient descent, backpropagation, and&nbsp;more.</h4>
<h3>&nbsp;</h3>
<h3 class="graf graf--h3">What is deep learning?</h3>
<p class="graf graf--p">It’s <strong class="markup--strong markup--p-strong">learning from examples</strong>. That’s pretty much the deal.</p>
<p class="graf graf--p">At a very basic level, deep learning is a machine learning technique. It teaches a computer to filter inputs through layers to learn how to predict and classify information. Observations can be in the form of images, text, or sound.</p>
<p class="graf graf--p">The inspiration for deep learning is the way that the human brain filters information. Its purpose is to mimic how the human brain works to create some real magic.</p>
<p class="graf graf--p"><em class="markup--em markup--p-em">It’s literally an artificial neural network</em>.</p>
<p class="graf graf--p">In the human brain, there are roughly 86 to 100 billion neurons. Each neuron connects to thousands of its neighbors. We’re kind of recreating that, but in a way and at a level that works for machines.</p>
<p class="graf graf--p">In our brains, a neuron has a body, dendrites, and an axon. The signal from one neuron travels down the axon and transfers to the dendrites of the next neuron. That connection where the signal passes is called a synapse.</p>
<figure class="graf graf--figure"><img data-recalc-dims="1" decoding="async" class="graf-image" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/0%2Aoo9SAvvcY7ni5ncC.png?w=1080&#038;ssl=1" data-image-id="0*oo9SAvvcY7ni5ncC.png" data-width="482" data-height="640"><figcaption class="imageCaption">Image by mohamed_hassan on&nbsp;<a class="markup--anchor markup--figure-anchor" href="https://pixabay.com/" target="_blank" rel="noopener noreferrer" data-href="https://pixabay.com/">Pixabay</a></figcaption></figure>
<p class="graf graf--p">Neurons by themselves are kind of useless. But when you have lots of them, they work together to create some serious magic. That’s the idea behind a deep learning algorithm! You get input from observation and you put your input into one layer. That layer creates an output which in turn becomes the input for the next layer, and so on. This happens over and over until your final output signal!</p>
<p class="graf graf--p">The neuron (<strong class="markup--strong markup--p-strong">node</strong>) gets a signal or signals (<strong class="markup--strong markup--p-strong">input values</strong>), which pass through the neuron. That neuron delivers the <strong class="markup--strong markup--p-strong">output signal</strong>.</p>
<p class="graf graf--p">Think of the input layer as your senses: the things you see, smell, and feel, for example. These are independent variables for one single observation. This information is broken down into numbers and the bits of binary data that a computer can use. You’ll need to either standardize or normalize these variables so that they’re within the same range.</p>
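<p class="graf graf--p">To make the last step concrete, here is a small Python sketch of the two usual options. The function names and the sample values are made up for illustration: normalizing squeezes a variable into the range 0 to 1, while standardizing shifts it to a mean of 0 and a standard deviation of 1.</p>

```python
from statistics import mean, pstdev


def normalize(values):
    """Min-max normalization: rescale values into the range 0 to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Standardization: shift to mean 0 and scale to standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


ages = [20, 30, 40, 50, 60]  # hypothetical input variable
print(normalize(ages))       # [0.0, 0.25, 0.5, 0.75, 1.0]
print(standardize(ages))
```

<p class="graf graf--p">Either way, a variable measured in the thousands no longer drowns out one measured in fractions.</p>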
<p class="graf graf--p">Deep learning algorithms use many layers of nonlinear processing units for feature extraction and transformation. Each successive layer uses the output of the previous layer for its input. What they learn forms a hierarchy of concepts. In this hierarchy, each level learns to transform its input data into a more and more abstract and composite representation.</p>
<figure class="graf graf--figure"><img data-recalc-dims="1" decoding="async" class="graf-image" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/0%2Apk2jVyLk2m9sRixz.png?w=1080&#038;ssl=1" data-image-id="0*pk2jVyLk2m9sRixz.png" data-width="640" data-height="360"><p></p>
<figcaption class="imageCaption">Image by ahmedgad on&nbsp;<a class="markup--anchor markup--figure-anchor" href="http://pixabay.com/" target="_blank" rel="noopener noreferrer" data-href="http://pixabay.com/">Pixabay</a></figcaption>
</figure>
<p class="graf graf--p">That means that for an image, for example, the input might be a matrix of pixels. The first layer might abstract the pixels and encode edges. The next layer might compose an arrangement of edges. The next layer might encode a nose and eyes. The next layer might recognize that the image contains a face, and so on.</p>
<h3>&nbsp;</h3>
<h3 class="graf graf--h3">What happens inside the&nbsp;neuron?</h3>
<p class="graf graf--p">The input node takes in information in a numerical form. The information is presented as an activation value where each node is given a number. The higher the number, the greater the activation.</p>
<p class="graf graf--p">Based on the connection strength (weights) and transfer function, the activation value passes to the next node. Each of the nodes sums the activation values that it receives (it calculates the <strong class="markup--strong markup--p-strong">weighted sum</strong>) and then applies its activation function (also called a transfer function) to that sum. From the result, the neuron understands if it needs to pass along a signal or not.</p>
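<p class="graf graf--p">Here is a rough Python sketch of what a single neuron does with its inputs. The input values, the weights, and the choice of a sigmoid as the activation function are all made up for illustration, and the optional <code>bias</code> term is an addition that this article doesn’t cover.</p>

```python
import math


def sigmoid(z):
    """Squash any number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, bias=0.0):
    # Weighted sum: each input times the weight on its synapse.
    # The bias defaults to 0 and is an assumption, not something from the text.
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    # The activation function decides how strong a signal gets passed along.
    return sigmoid(weighted_sum)


inputs = [0.5, 0.1, 0.9]     # hypothetical activation values coming in
weights = [0.4, -0.2, 0.7]   # hypothetical connection strengths
print(neuron_output(inputs, weights))
```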
<p class="graf graf--p">Each of the synapses gets assigned weights, which are crucial to <strong class="markup--strong markup--p-strong">Artificial Neural Networks </strong>(ANNs). Weights are how ANNs learn. By adjusting the weights, the ANN decides to what extent signals get passed along. When you’re training your network, you’re deciding how the weights are adjusted.</p>
<p class="graf graf--p">The activation runs through the network until it reaches the output nodes. The output nodes then give us the information in a way that we can understand. Your network will use a cost function to compare its output with the expected output. The cost function measures model performance as the difference between the actual value and the predicted value. There are many different cost functions you can use, but they all tell you how much error there is in your network, and you’re working to minimize it. (In essence, the lower the cost, the closer your output is to the desired output.) The information goes back, and the neural network begins to learn with the goal of minimizing the cost function by tweaking the weights. This process is called <strong class="markup--strong markup--p-strong">backpropagation</strong>.</p>
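<p class="graf graf--p">As one example of a cost function, here is a minimal sketch of mean squared error in Python. It’s only one of the many options mentioned above, and the numbers are invented to show that a closer guess gives a lower cost.</p>

```python
def mean_squared_error(predicted, actual):
    """Average of the squared differences between predictions and targets."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


actual = [1.0, 0.0, 1.0]       # what we wanted the network to say
good_guess = [0.9, 0.1, 0.8]   # hypothetical output of a well-trained network
bad_guess = [0.2, 0.9, 0.3]    # hypothetical output of an untrained network

print(mean_squared_error(good_guess, actual))  # small cost
print(mean_squared_error(bad_guess, actual))   # much larger cost
```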
<p class="graf graf--p">In <strong class="markup--strong markup--p-strong">forward propagation</strong>, information is entered into the input layer and propagates forward through the network to get our output values. We compare the values to our expected results. Next, we calculate the errors and propagate the info backward. This allows us to train the network and update the weights. Because of the way the backpropagation algorithm is structured, you’re able to adjust all of the weights simultaneously. It also lets you see which part of the error each of your weights in the neural network is responsible for.</p>
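<p class="graf graf--p">Here is a deliberately tiny sketch of that forward-then-backward loop in Python. It assumes a toy network with just two weights (one hidden sigmoid neuron feeding one linear output neuron), a squared-error cost, and made-up starting values. Real networks have many more weights, but the idea is the same: one backward pass works out each weight’s share of the error, and then both weights get updated in the same step.</p>

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w1, w2):
    hidden = sigmoid(w1 * x)   # hidden neuron
    output = w2 * hidden       # linear output neuron
    return hidden, output


def cost(x, target, w1, w2):
    _, output = forward(x, w1, w2)
    return 0.5 * (output - target) ** 2


def backward(x, target, w1, w2):
    """Chain rule: each weight's share of the error."""
    hidden, output = forward(x, w1, w2)
    error = output - target               # how far off the output is
    grad_w2 = error * hidden              # blame assigned to w2
    grad_hidden = error * w2              # error sent back to the hidden neuron
    grad_w1 = grad_hidden * hidden * (1.0 - hidden) * x  # blame assigned to w1
    return grad_w1, grad_w2


x, target = 1.5, 1.0        # one made-up training example
w1, w2 = 0.8, -0.4          # made-up starting weights
learning_rate = 0.5
for _ in range(200):
    g1, g2 = backward(x, target, w1, w2)
    # Both weights are adjusted together, using the same backward pass.
    w1 -= learning_rate * g1
    w2 -= learning_rate * g2

print(cost(x, target, w1, w2))  # far smaller than the starting cost
```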
<p class="graf graf--p">When you’ve adjusted the weights to the optimal level, you’re ready to proceed to the testing phase!</p>
<figure class="graf graf--figure"><img data-recalc-dims="1" decoding="async" class="graf-image" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2AQC88xWhhRqpW7qM-bIsRKg.jpeg?w=1080&#038;ssl=1" data-image-id="1*QC88xWhhRqpW7qM-bIsRKg.jpeg" data-width="3409" data-height="3410"><p></p>
<figcaption class="imageCaption">Photo by <a class="markup--anchor markup--figure-anchor" href="https://www.pexels.com/@yogendras31?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels" target="_blank" rel="noopener noreferrer" data-href="https://www.pexels.com/@yogendras31?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels">Yogendra Singh </a>from&nbsp;<a class="markup--anchor markup--figure-anchor" href="https://www.pexels.com/photo/photo-of-jumping-man-1701203/?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels" target="_blank" rel="noopener noreferrer" data-href="https://www.pexels.com/photo/photo-of-jumping-man-1701203/?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels">Pexels</a></figcaption>
</figure>
<h3>&nbsp;</h3>
<h3 class="graf graf--h3">How does an artificial neural network&nbsp;learn?</h3>
<p class="graf graf--p">There are two different approaches to get a program to do what you want. First, there’s the specifically guided and hard-programmed approach. You tell the program exactly what you want it to do. Then there are <strong class="markup--strong markup--p-strong">neural networks</strong>. In neural networks, you tell your network the inputs and what you want for the outputs, and then you let it learn on its own.</p>
<p class="graf graf--p">By allowing the network to learn on its own, you can avoid the necessity of entering in all of the rules. You can create the architecture and then let it go and learn. Once it’s trained up, you can give it a new image and it will be able to determine the output on its own.</p>
<h3>&nbsp;</h3>
<h3 class="graf graf--h3">Feedforward and feedback&nbsp;networks</h3>
<p class="graf graf--p">A <strong class="markup--strong markup--p-strong">feedforward </strong>network is a network that contains inputs, outputs, and hidden layers. The signals can only travel in one direction (forward). Input data passes into a layer where calculations are performed. Each processing element computes its output based on the weighted sum of its inputs. The new values become the new input values that feed the next layer (feed-forward). This continues through all the layers and determines the output. Feedforward networks are often used in, for example, data mining.</p>
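<p class="graf graf--p">A feedforward pass can be sketched in a few lines of Python. The layer sizes, weights, and the sigmoid activation here are all assumptions picked for illustration: three inputs feed a hidden layer of two neurons, which feeds a single output neuron.</p>

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights):
    """weights[j] holds the incoming weights of neuron j in this layer."""
    return [
        sigmoid(sum(x * w for x, w in zip(inputs, neuron_weights)))
        for neuron_weights in weights
    ]


def feedforward(inputs, layers):
    values = inputs
    for weights in layers:
        # The output of one layer becomes the input of the next.
        values = layer_forward(values, weights)
    return values


hidden_weights = [[0.2, -0.5, 0.1], [0.7, 0.3, -0.6]]  # 2 neurons, 3 weights each
output_weights = [[1.0, -1.0]]                          # 1 neuron, 2 weights
print(feedforward([1.0, 0.5, -1.5], [hidden_weights, output_weights]))
```

<p class="graf graf--p">Notice that each hidden neuron carries three weights, one per input, and nothing ever flows backward.</p>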
<p class="graf graf--p">A <strong class="markup--strong markup--p-strong">feedback network </strong>(for example, a recurrent neural network) has feedback paths. This means that signals can travel in both directions using loops. All possible connections between neurons are allowed. Since loops are present in this type of network, it becomes a non-linear dynamic system which changes continuously until it reaches a state of equilibrium. Feedback networks are often used in optimization problems where the network looks for the best arrangement of interconnected factors.</p>
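<p class="graf graf--p">To get a feel for “changes until it reaches equilibrium,” here is a toy Python sketch of two neurons that feed their outputs back into each other. This is not a full recurrent neural network: the weights and inputs are made up, and the weights are kept deliberately small so that the loop settles down. With larger weights a network like this could keep oscillating instead.</p>

```python
import math


def step(state, weights, external_input):
    """One update: every neuron reads the current state of all neurons."""
    return [
        math.tanh(sum(w * s for w, s in zip(row, state)) + x)
        for row, x in zip(weights, external_input)
    ]


def run_until_stable(weights, external_input, tolerance=1e-9, max_steps=1000):
    state = [0.0] * len(external_input)
    for steps in range(1, max_steps + 1):
        new_state = step(state, weights, external_input)
        change = max(abs(a - b) for a, b in zip(new_state, state))
        state = new_state
        if change < tolerance:
            return state, steps   # equilibrium: the state stopped changing
    return state, max_steps


weights = [[0.0, 0.3], [-0.2, 0.1]]   # hypothetical loop weights, kept small
state, steps = run_until_stable(weights, [0.5, -0.1])
print(state, steps)
```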
<h3>&nbsp;</h3>
<h3 class="graf graf--h3">What is a weighted&nbsp;sum?</h3>
<p class="graf graf--p">Inputs to a neuron can either be features from a training set or outputs from the neurons of a previous layer. Each connection between two neurons has a unique synapse with a unique weight attached. If you want to get from one neuron to the next, you have to travel along the synapse and pay the “toll” (weight). The neuron then applies an activation function to the sum of the weighted inputs from each incoming synapse. It passes the result on to all the neurons in the next layer. When we talk about updating weights in a network, we’re talking about adjusting the weights on these synapses.</p>
<p class="graf graf--p">A neuron’s input is the sum of weighted outputs from all the neurons in the previous layer. Each input is multiplied by the weight associated with the synapse connecting the input to the current neuron. If there are 3 inputs or neurons in the previous layer, each neuron in the current layer will have 3 distinct weights: one for each synapse.</p>
<h3 class="graf graf--h3"><strong class="markup--strong markup--h3-strong">What is an activation function?</strong></h3>
<p class="graf graf--p">In a nutshell, the activation function of a node defines the output of that node.</p>
<p class="graf graf--p">The activation function (or transfer function) translates the input signals to output signals. It maps the output values on a range like 0 to 1 or -1 to 1. It’s an abstraction that represents the rate of action potential firing in the cell. It’s a number that represents the likelihood that the cell will fire. At its simplest, the function is binary: <strong class="markup--strong markup--p-strong">yes </strong>(the neuron fires) or <strong class="markup--strong markup--p-strong">no </strong>(the neuron doesn’t fire). The output can be either 0 or 1 (on/off or yes/no), or it can be anywhere in a range.</p>
<p class="graf graf--p">What options do we have? There are many activation functions, but these are four very common ones:</p>
<h3 class="graf graf--h3"><strong class="markup--strong markup--h3-strong">Threshold function</strong>&nbsp;</h3>
<p class="graf graf--p">This is a step function. If the summed value of the input is below a certain threshold (zero in this example), the function passes on 0. If it’s equal to or more than zero, then it passes on 1. It’s a very rigid, straightforward, yes or no function.</p>
<figure class="graf graf--figure"><img data-recalc-dims="1" decoding="async" class="graf-image" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/0%2ALLWTicPcC3T1X9X5.png?w=1080&#038;ssl=1" data-image-id="0*LLWTicPcC3T1X9X5.png" data-width="288" data-height="216"><p></p>
<figcaption class="imageCaption"><em>Example threshold function</em></figcaption>
</figure>
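<p class="graf graf--p">In Python, a threshold function is about as short as a function gets. This is a minimal sketch; the <code>cutoff</code> parameter is an addition for illustration and defaults to zero.</p>

```python
def threshold(weighted_sum, cutoff=0.0):
    """Step function: 1 at or above the cutoff, 0 below it."""
    return 1 if weighted_sum >= cutoff else 0


print([threshold(z) for z in (-2.0, -0.1, 0.0, 0.1, 2.0)])  # [0, 0, 1, 1, 1]
```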
<h3 class="graf graf--h3"><strong class="markup--strong markup--h3-strong">Sigmoid function</strong></h3>
<p class="graf graf--p">This function is used in logistic regression. Unlike the threshold function, it’s a smooth, gradual progression from 0 to 1. It’s useful in the output layer and is used heavily when you need to predict probabilities.</p>
<figure class="graf graf--figure"><img data-recalc-dims="1" decoding="async" class="graf-image" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/0%2ANHsOkUbb8c9V06ZQ.png?w=1080&#038;ssl=1" data-image-id="0*NHsOkUbb8c9V06ZQ.png" data-width="288" data-height="183"><p></p>
<figcaption class="imageCaption"><em>Example sigmoid&nbsp;function</em></figcaption>
</figure>
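<p class="graf graf--p">A quick sketch of the sigmoid in Python shows that smooth climb from 0 to 1. The sample inputs are arbitrary.</p>

```python
import math


def sigmoid(z):
    """Smoothly map any number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-z))


for z in (-6, -2, 0, 2, 6):
    print(z, round(sigmoid(z), 3))
```

<p class="graf graf--p">An input of 0 lands exactly in the middle at 0.5, and the output creeps toward 0 or 1 as the input gets more extreme.</p>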
<h3 class="graf graf--h3"><strong class="markup--strong markup--h3-strong">Hyperbolic Tangent&nbsp;Function</strong>&nbsp;</h3>
<p class="graf graf--p">This function is very similar to the sigmoid function. But unlike the sigmoid function which goes from 0 to 1, the value goes below zero, from -1 to 1. Even though this isn’t a lot like what happens in a brain, this function gives better results when it comes to training neural networks. Neural networks sometimes get “stuck” during training with the sigmoid function. This happens when there’s a lot of strongly negative input that keeps the output near zero, which messes with the learning process.</p>
<figure class="graf graf--figure"><img data-recalc-dims="1" decoding="async" class="graf-image" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/0%2A5MFqcP1Qdn6htYAC.png?w=1080&#038;ssl=1" data-image-id="0*5MFqcP1Qdn6htYAC.png" data-width="288" data-height="187"><p></p>
<figcaption class="imageCaption"><em>Example hyperbolic tangent function&nbsp;(tanh)</em></figcaption>
</figure>
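<p class="graf graf--p">Python’s standard library already includes the hyperbolic tangent, so a sketch only needs to call it. The comparison with a stretched and shifted sigmoid is there to show how closely related the two functions are; the sample inputs are arbitrary.</p>

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


for z in (-2.0, -0.5, 0.0, 0.5, 2.0):
    # tanh is a sigmoid stretched to cover -1 to 1 and centered on zero.
    rescaled_sigmoid = 2.0 * sigmoid(2.0 * z) - 1.0
    print(z, round(math.tanh(z), 4), round(rescaled_sigmoid, 4))
```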
<h3 class="graf graf--h3"><strong class="markup--strong markup--h3-strong">Rectifier function</strong>&nbsp;</h3>
<p class="graf graf--p">This might be the most popular activation function in the universe of neural networks. It’s efficient to compute and, according to Glorot et al., more biologically plausible than the hyperbolic tangent. Even though it has a kink at 0, it’s a straight line after the kink: negative inputs give 0, and positive inputs pass through unchanged. This means, for example, that your output would be either “no” or a degree of “yes” that grows with the input. This function doesn’t require normalization or other complicated calculations.</p>
<figure class="graf graf--figure"><img data-recalc-dims="1" decoding="async" class="graf-image" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/0%2A6h-5AslRa-oWwAie.png?w=1080&#038;ssl=1" data-image-id="0*6h-5AslRa-oWwAie.png" data-width="288" data-height="175"><p></p>
<figcaption class="imageCaption"><em>Example rectifier function</em></figcaption>
</figure>
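<p class="graf graf--p">The rectifier (often called ReLU) is a one-liner in Python. A minimal sketch with arbitrary sample inputs:</p>

```python
def rectifier(z):
    """Pass positive values through unchanged; clamp everything else to 0."""
    return max(0.0, z)


print([rectifier(z) for z in (-3.0, -0.5, 0.0, 0.5, 3.0)])  # [0.0, 0.0, 0.0, 0.5, 3.0]
```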
<h3 class="graf graf--h3">Which activation function should you use?</h3>
<p class="graf graf--p">So let’s say, for example, your desired value is binary. You’re looking for a “yes” or a “no.” Which activation function do you want to use?&nbsp;</p>
<p class="graf graf--p">From the above examples, you could use the threshold function or you could go with the sigmoid activation function. The threshold function would give you a “yes” or “no” (1 or 0). The sigmoid function would be able to give you the probability of a yes.</p>
<p class="graf graf--p">If you were using a sigmoid function to determine how likely it is that an image is a cat, for example, an output of 0.9 would show a 90% probability that your image is, in fact, a cat.</p>
<figure class="graf graf--figure"><img data-recalc-dims="1" decoding="async" class="graf-image" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/0%2A0g4Clzsw03sQLHI6.jpeg?w=1080&#038;ssl=1" data-image-id="0*0g4Clzsw03sQLHI6.jpeg" data-width="640" data-height="359"><p></p>
<figcaption class="imageCaption"><em>Photo by minanafotos on&nbsp;<a class="markup--anchor markup--figure-anchor" href="https://pixabay.com/" target="_blank" rel="noopener noreferrer" data-href="https://pixabay.com/">Pixabay</a></em></figcaption>
</figure>
<p class="graf graf--p"><em class="markup--em markup--p-em">Want to dive deeper? Check out </em><a class="markup--anchor markup--p-anchor" href="http://proceedings.mlr.press/v15/glorot11a/glorot11a.pdf" target="_blank" rel="noopener noreferrer" data-href="http://proceedings.mlr.press/v15/glorot11a/glorot11a.pdf"><em class="markup--em markup--p-em">Deep Sparse Rectifier Neural Networks</em></a><em class="markup--em markup--p-em"> by Xavier Glorot, et al.</em></p>
<h3 class="graf graf--h3">How do you adjust the&nbsp;weights?</h3>
<p class="graf graf--p">You could use a brute force approach to adjust the weights and test thousands of different combinations. But even with the simplest neural network, one that has only five input values and a single hidden layer, you’ll wind up with 10⁷⁵ possible combinations.</p>
<p class="graf graf--p">Running this on the world’s fastest supercomputer would take longer than the universe has existed so far.</p>
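<p class="graf graf--p">A rough back-of-the-envelope check shows why. The sketch below takes the 10⁷⁵ figure from above and assumes a purely hypothetical machine that can test 10¹⁷ weight combinations every second. That rate is a generous assumption, not a benchmark of any real supercomputer. The age of the universe is the commonly cited estimate of about 13.8 billion years.</p>

```python
combinations = 10 ** 75            # figure quoted in the text
evals_per_second = 10 ** 17        # assumption: a generous, hypothetical rate
seconds_per_year = 365.25 * 24 * 3600
age_of_universe_years = 13.8e9     # commonly cited estimate

years_needed = combinations / evals_per_second / seconds_per_year
print(f"{years_needed:.1e} years of brute force")
print(f"{years_needed / age_of_universe_years:.1e} times the age of the universe")
```

<p class="graf graf--p">Even if the assumed rate is off by many orders of magnitude, the conclusion doesn’t change: brute force is not an option.</p>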
<h3 class="graf graf--h3">Enter gradient&nbsp;descent</h3>
<p class="graf graf--p">But if you go with <strong class="markup--strong markup--p-strong">gradient descent</strong>, you can look at the slope of the cost function with respect to the weights and find out if it’s positive or negative. This tells you which way is downhill, so you can keep adjusting the weights in that direction on your quest to reach the global minimum.</p>
<figure class="graf graf--figure"><img data-recalc-dims="1" decoding="async" class="graf-image" src="https://i0.wp.com/cdn-images-1.medium.com/max/1600/1%2AIOaN68A9G9-ReSjZBQ_LZw.jpeg?w=1080&#038;ssl=1" data-image-id="1*IOaN68A9G9-ReSjZBQ_LZw.jpeg" data-width="3000" data-height="3751"><p></p>
<figcaption class="imageCaption"><em>Photo by <a class="markup--anchor markup--figure-anchor" href="https://www.pexels.com/@ranjan-simkhada-1263037?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels" target="_blank" rel="noopener noreferrer" data-href="https://www.pexels.com/@ranjan-simkhada-1263037?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels">RANJAN SIMKHADA </a>from&nbsp;<a class="markup--anchor markup--figure-anchor" href="https://www.pexels.com/photo/man-sitting-on-a-mountain-cliff-2402891/?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels" target="_blank" rel="noopener noreferrer" data-href="https://www.pexels.com/photo/man-sitting-on-a-mountain-cliff-2402891/?utm_content=attributionCopyText&amp;utm_medium=referral&amp;utm_source=pexels">Pexels</a></em></figcaption>
</figure>
<p class="graf graf--p"><strong class="markup--strong markup--p-strong">Gradient descent </strong>is an algorithm for finding the minimum of a function. The analogy you’ll see over and over is that of someone stuck on top of a mountain and trying to get down (find the minimum). There’s heavy fog making it impossible to see the path, so she uses gradient descent to get down to the bottom of the mountain. She looks at the steepness of the hill where she is and proceeds down in the direction of the steepest descent. You should assume that the steepness isn’t immediately obvious. Luckily, she has a tool that can measure steepness!</p>
<p class="graf graf--p">Unfortunately, this tool takes forever.</p>
<p class="graf graf--p">She wants to use it as infrequently as she can to get down the mountain before dark. The real difficulty is choosing how often she wants to use her tool so she doesn’t go off track.</p>
<p class="graf graf--p">In this analogy, the person is the algorithm. The steepness of the hill is the slope of the error surface at that point. The direction she goes is downhill, which is the opposite of the gradient of the error surface at that point. The tool she’s using is differentiation (the slope of the error surface can be calculated by taking the derivative of the squared error function at that point). The distance she travels before taking another measurement is set by the learning rate of the algorithm. It’s not a perfect analogy, but it gives you a good sense of what gradient descent is all about. The machine uses the gradient to work out the direction the model should move in to reduce errors.</p>
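<p class="graf graf--p">Here is a minimal sketch of that walk down the mountain in Python. The cost function, the starting point, and the learning rate are all made up for illustration: the “mountain” is a simple bowl with its lowest point at a weight of 3.</p>

```python
def cost(w):
    """Toy squared-error surface with its minimum at w = 3."""
    return (w - 3.0) ** 2

def slope(w):
    """Derivative of the cost: the steepness-measuring tool from the analogy."""
    return 2.0 * (w - 3.0)

def gradient_descent(w, learning_rate=0.1, steps=100):
    for _ in range(steps):
        # Move against the slope, which is the downhill direction
        w = w - learning_rate * slope(w)
    return w

w_final = gradient_descent(w=-4.0)
print(round(w_final, 4), cost(w_final))
```

<p class="graf graf--p">Each pass through the loop is one measurement with the tool followed by one step. A larger learning rate means longer strides between measurements, which is faster but makes it easier to overshoot the bottom.</p>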
<p class="graf graf--p">Gradient descent is only guaranteed to reach the global minimum when the cost function is convex, but what if it isn’t?</p>
<p class="graf graf--p">Normal gradient descent can get stuck at a local minimum rather than the global minimum, resulting in a subpar network. In normal gradient descent, we take all our rows and plug them into the same neural network, take a look at the cost, and then adjust the weights. This is called batch gradient descent. In stochastic gradient descent, we take the rows one by one, run the neural network, look at the cost function, adjust the weights, and then move to the next row. Essentially, you’re adjusting the weights for each row.</p>
<p class="graf graf--p"><strong class="markup--strong markup--p-strong">Stochastic gradient descent </strong>has much higher fluctuations, which can help it escape local minima and get closer to the global minimum. It’s called “stochastic” because samples are picked in random order, instead of as a single group or in the order they appear in the training set. It looks like it might be slower, but it’s often faster because it doesn’t have to load all the data into memory and wait while the data is all run together. The main pro for batch gradient descent is that it’s a deterministic algorithm. This means that if you have the same starting weights, every time you run the network you will get the same results. Stochastic gradient descent is always working at random. (You can also run mini-batch gradient descent where you set a number of rows, run that many rows at a time, and then update your weights.)</p>
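<p class="graf graf--p">The three flavors differ only in how many rows you look at before each weight update. The sketch below shows that with a one-weight model on made-up data where the ideal weight is 2. The data, the function names, and the settings are assumptions for illustration, and the random seed is fixed only so the example is repeatable.</p>

```python
import random

# Toy data: y = 2 * x, so the ideal weight is 2.0
rows = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]

def gradient(w, batch):
    """Average gradient of the squared error (w*x - y)**2 over a batch of rows."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)

def train(batch_size, epochs=200, learning_rate=0.01, seed=0):
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    data = list(rows)
    w = 0.0
    for _ in range(epochs):
        rng.shuffle(data)       # the "stochastic" part: visit rows in random order
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            w -= learning_rate * gradient(w, batch)   # one update per batch
    return w

w_batch = train(batch_size=len(rows))  # batch: one update per epoch
w_sgd = train(batch_size=1)            # stochastic: one update per row
w_mini = train(batch_size=2)           # mini-batch: one update per two rows
print(w_batch, w_sgd, w_mini)
```

<p class="graf graf--p">On this tiny, noise-free dataset all three end up at essentially the same weight. The difference shows up on large, messy datasets, where the number of rows per update trades stability against speed.</p>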
<p class="graf graf--p">Many improvements on the basic stochastic gradient descent algorithm have been proposed and used, including implicit updates (ISGD), momentum method, averaged stochastic gradient descent, adaptive gradient algorithm (AdaGrad), root mean square propagation (RMSProp), adaptive moment estimation (Adam), and more.</p>
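<p class="graf graf--p">As one example from that list, here is a minimal sketch of the classical momentum method on a toy cost function. The cost function and the settings are assumptions chosen for illustration. The idea is to keep a running “velocity” so that each step carries over part of the previous one.</p>

```python
def slope(w):
    """Derivative of the toy cost (w - 3)**2, which is lowest at w = 3."""
    return 2.0 * (w - 3.0)

def descend_with_momentum(w, learning_rate=0.1, momentum=0.9, steps=300):
    velocity = 0.0
    for _ in range(steps):
        # Keep a fraction of the previous step, then add the new downhill push
        velocity = momentum * velocity - learning_rate * slope(w)
        w = w + velocity
    return w

w_final = descend_with_momentum(w=-4.0)
print(round(w_final, 4))
```

<p class="graf graf--p">Setting the momentum to 0 turns this back into plain gradient descent.</p>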
<p class="graf graf--p">So here’s a quick walkthrough of training an artificial neural network with stochastic gradient descent:</p>
<ul class="postList">
<li class="graf graf--li">Randomly initialize the weights to small numbers close to 0.</li>
<li class="graf graf--li">Input the first observation of your dataset into the input layer, with each feature in one input node.</li>
<li class="graf graf--li"><strong class="markup--strong markup--li-strong">Forward propagation</strong> — from left to right, the neurons are activated in a way that each neuron’s activation is limited by the weights. You propagate the activations until you get the predicted result.</li>
<li class="graf graf--li">Compare the predicted result to the actual result and measure the generated error.</li>
<li class="graf graf--li"><strong class="markup--strong markup--li-strong">Backpropagation</strong> — from right to left, the error is back propagated. The weights are updated according to how much they are responsible for the error. (The learning rate decides how much we update the weights.)</li>
<li class="graf graf--li"><strong class="markup--strong markup--li-strong">Online learning</strong> (repeat steps 1–5 and update the weights after each observation) <strong class="markup--strong markup--li-strong">OR</strong> <strong class="markup--strong markup--li-strong">batch learning</strong> (repeat steps 1–5, but update the weights only after a batch of observations).</li>
<li class="graf graf--li">When the whole training set has passed through the ANN, that is one epoch. Repeat with more epochs.</li>
</ul>
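<p class="graf graf--p">The steps above can be sketched in plain Python. This is a toy illustration under stated assumptions, not a production implementation: the network has two inputs, two hidden neurons, and one output, every neuron uses the sigmoid function, the cost is half the squared error, and the dataset is the logical OR of two inputs. All of the names and settings are made up for this example.</p>

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy dataset (an assumption for illustration): the logical OR of two inputs
DATA = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
N_HIDDEN = 2

def init_weights(rng):
    # Step 1: small random weights close to 0
    small = lambda: rng.uniform(-0.1, 0.1)
    return {
        "w_hidden": [[small(), small()] for _ in range(N_HIDDEN)],
        "b_hidden": [small() for _ in range(N_HIDDEN)],
        "w_out": [small() for _ in range(N_HIDDEN)],
        "b_out": small(),
    }

def forward(net, x):
    # Step 3: forward propagation, left to right
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
        for ws, b in zip(net["w_hidden"], net["b_hidden"])
    ]
    output = sigmoid(sum(w * h for w, h in zip(net["w_out"], hidden)) + net["b_out"])
    return hidden, output

def mean_squared_error(net):
    return sum((forward(net, x)[1] - y) ** 2 for x, y in DATA) / len(DATA)

def train(epochs=2000, learning_rate=0.5, seed=1):
    rng = random.Random(seed)
    net = init_weights(rng)
    rows = list(DATA)
    for _ in range(epochs):              # Step 7: one pass over all rows is one epoch
        rng.shuffle(rows)
        for x, y in rows:                # Step 2: feed one observation at a time
            hidden, output = forward(net, x)
            error = output - y           # Step 4: compare prediction with the target
            # Step 5: backpropagation, right to left (chain rule through the sigmoids)
            delta_out = error * output * (1.0 - output)
            delta_hidden = [
                delta_out * net["w_out"][j] * hidden[j] * (1.0 - hidden[j])
                for j in range(N_HIDDEN)
            ]
            # Step 6: update the weights after every single observation
            for j in range(N_HIDDEN):
                net["w_out"][j] -= learning_rate * delta_out * hidden[j]
                net["b_hidden"][j] -= learning_rate * delta_hidden[j]
                for i in range(len(x)):
                    net["w_hidden"][j][i] -= learning_rate * delta_hidden[j] * x[i]
            net["b_out"] -= learning_rate * delta_out
    return net

untrained = init_weights(random.Random(1))
trained = train()
print("error before:", round(mean_squared_error(untrained), 4))
print("error after: ", round(mean_squared_error(trained), 4))
```

<p class="graf graf--p">Because the weights are updated after every row, this is the one-observation-at-a-time variant from the list. To update after a batch instead, you would add up the adjustments over several rows before applying them.</p>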
<h4 class="graf graf--h4">There you have it! Those are the basic ideas behind what’s happening in an artificial neural network.</h4>
<p class="graf graf--h4">Congratulations! Now you know what deep learning is and how it works!</p>
<p class="graf graf--p"><em class="markup--em markup--p-em">Hungry for more? You might want to read </em><a class="markup--anchor markup--p-anchor" href="http://yann.lecun.com/exdb/publis/pdf/lecun-98b.pdf" target="_blank" rel="noopener noreferrer" data-href="http://yann.lecun.com/exdb/publis/pdf/lecun-98b.pdf"><em class="markup--em markup--p-em">Efficient BackProp</em></a><em class="markup--em markup--p-em"> by Yann LeCun, et al., as well as </em><a class="markup--anchor markup--p-anchor" href="http://neuralnetworksanddeeplearning.com/" target="_blank" rel="noopener noreferrer" data-href="http://neuralnetworksanddeeplearning.com/"><em class="markup--em markup--p-em">Neural Networks and Deep Learning</em></a><em class="markup--em markup--p-em"> by Michael Nielsen. </em><em class="markup--em markup--p-em">If you’re interested in learning more about cost functions, check out </em><a class="markup--anchor markup--p-anchor" href="https://stats.stackexchange.com/questions/154879/a-list-of-cost-functions-used-in-neural-networks-alongside-applications" target="_blank" rel="noopener noreferrer" data-href="https://stats.stackexchange.com/questions/154879/a-list-of-cost-functions-used-in-neural-networks-alongside-applications"><em class="markup--em markup--p-em">A List of Cost Functions Used in Neural Networks, Alongside Applications</em></a><em class="markup--em markup--p-em">.</em></p>
<p class="graf graf--p"><em class="markup--em markup--p-em">You might also want to check out this one:</em></p>
<div class="graf graf--mixtapeEmbed"><a href="https://contentsimplicity.com/wft-is-image-classification/" target="_blank" rel="noopener noreferrer"><b>Convolutional Neural Networks and Image Classification</b></a></div>
<p class="graf graf--p">Thanks for reading! As always, if you do anything cool with this information, leave a comment in the notes below or reach out on LinkedIn <a class="markup--anchor markup--p-anchor" href="https://www.linkedin.com/in/annebonnerdata/" target="_blank" rel="noopener noreferrer" data-href="https://www.linkedin.com/in/annebonnerdata/">@annebonnerdata.</a></p>


<div style = "display:none;"> <figure class="wp-block-image size-large"><img data-recalc-dims="1" fetchpriority="high" decoding="async" width="683" height="1024" data-attachment-id="960" data-permalink="https://contentsimplicity.com/what-is-deep-learning-and-how-does-it-work/what-is-deep-learning_/" data-orig-file="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/09/What-is-deep-learning_.png?fit=735%2C1102&amp;ssl=1" data-orig-size="735,1102" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="What is deep learning_" data-image-description="" data-image-caption="" data-medium-file="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/09/What-is-deep-learning_.png?fit=200%2C300&amp;ssl=1" data-large-file="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/09/What-is-deep-learning_.png?fit=683%2C1024&amp;ssl=1" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/09/What-is-deep-learning_.png?resize=683%2C1024&#038;ssl=1" alt="" class="wp-image-960" srcset="https://contentsimplicity.com/wp-content/uploads/2019/09/What-is-deep-learning_-683x1024.png 683w, https://contentsimplicity.com/wp-content/uploads/2019/09/What-is-deep-learning_-480x720.png 480w" sizes="(min-width: 0px) and (max-width: 480px) 480px, (min-width: 481px) 683px, 100vw" /></figure></div>
<span class="et_bloom_bottom_trigger"></span><p>The post <a href="https://contentsimplicity.com/what-is-deep-learning-and-how-does-it-work/">What is Deep Learning and How Does it Work?</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">795</post-id>	</item>
		<item>
		<title>Your Mobile Banking App has a Problem</title>
		<link>https://contentsimplicity.com/your-mobile-banking-app-has-a-problem/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=your-mobile-banking-app-has-a-problem&#038;utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=your-mobile-banking-app-has-a-problem</link>
					<comments>https://contentsimplicity.com/your-mobile-banking-app-has-a-problem/#respond</comments>
		
		<dc:creator><![CDATA[Anne B]]></dc:creator>
		<pubDate>Sun, 02 Jun 2019 20:24:45 +0000</pubDate>
				<category><![CDATA[data]]></category>
		<category><![CDATA[artificial intelligence]]></category>
		<category><![CDATA[banking]]></category>
		<category><![CDATA[Coding]]></category>
		<category><![CDATA[error]]></category>
		<category><![CDATA[finance]]></category>
		<category><![CDATA[image capture]]></category>
		<category><![CDATA[machine learning]]></category>
		<category><![CDATA[Programming]]></category>
		<guid isPermaLink="false">http://contentsimplicity.com/?p=447</guid>

					<description><![CDATA[<p>Errors in machine learning algorithms are creating critical (and nearly invisible) consequences.</p>
<p>The post <a href="https://contentsimplicity.com/your-mobile-banking-app-has-a-problem/">Your Mobile Banking App has a Problem</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p><div class="et_pb_section et_pb_section_0 et_section_regular" >
				
				
				
				
				
				
				<div class="et_pb_row et_pb_row_0">
				<div class="et_pb_column et_pb_column_4_4 et_pb_column_0  et_pb_css_mix_blend_mode_passthrough et-last-child">
				
				
				
				
				<div class="et_pb_module et_pb_text et_pb_text_0  et_pb_text_align_left et_pb_bg_layout_light">
				
				
				
				
				<div class="et_pb_text_inner"><h3>Errors in machine learning algorithms are creating critical (and nearly invisible) consequences</h3>
<p><em><a href="https://towardsdatascience.com/your-mobile-banking-app-has-a-problem-c2fe006e76c7">(This article first appeared in Towards Data Science)</a></em></p>
<p>Do you use a mobile banking app?</p>
<p><span style="font-size: 14px;"><strong>We have almost certainly paid thousands, if not millions, of dollars for returned checks that aren’t actually bad.</strong></span></p>
<p><strong style="font-size: 14px;">I’m not sure if anyone is aware of it.</strong></p>
<p><!-- divi:paragraph -->It is impossible that I’m the only person that this has happened to.</p>
<p><!-- /divi:paragraph --></p>
<p><!-- divi:paragraph -->But it’s easy to see how we might all be missing it.</p>
<p><!-- divi:image {"linkDestination":"custom"} --></p>
<figure class="wp-block-image"><img decoding="async" src="https://cdn-images-1.medium.com/max/1600/0*wZqZFYWUGcijFY7y" alt="" /><figcaption><em>Photo by <a href="https://unsplash.com/@ryoji__iwata?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="noreferrer noopener">Ryoji Iwata</a> on <a href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="noreferrer noopener">Unsplash</a></em></figcaption></figure>
<p><!-- divi:paragraph -->The technology behind mobile banking is pretty incredible, but what happens when there’s a mistake?</p>
<p><!-- divi:paragraph -->What happens if we don’t see the mistake?</p>
<p><!-- divi:paragraph -->We’re living in a world where so many technological advancements have been made that they almost blend into the background. We’ve gotten used to the idea that we can let our phones and computers do the little things for us. It’s easy to forget how new all of this technology really is.</p>
<p><!-- divi:paragraph -->But it is new. It’s changing every day. There are algorithms behind most of the basic things that you take for granted, from social media and entertainment to banking and finances. They are constantly evolving.<!-- /divi:paragraph --></p>
<p><!-- divi:paragraph -->They are not perfect.</p>
<p><!-- divi:paragraph --><strong>Pay attention!</strong></p>
<p><!-- divi:paragraph -->If an image capture system makes an error within your banking app that causes your deposit to be rejected, what will that cost you?</p>
<p><!-- divi:paragraph -->What if no one sees it? What will that cost us all?</p>
<p><!-- divi:image {"linkDestination":"custom"} --></p>
<figure class="wp-block-image"><a href="https://medium.com/@annebonner" target="_blank" rel="noreferrer noopener"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/06/d61c9-1fc1i0erhdu4bgomy9710bg.jpeg?w=1080&#038;ssl=1" alt="" /></a><figcaption></figcaption></figure>
<h4 id="8f00">This is not a bad check!</h4>
<p><!-- divi:paragraph -->Recently, I deposited a check on a mobile banking app that was accepted, only to be returned as a bad check a few days later. I was charged a fee for this.</p>
<p><!-- divi:paragraph -->Here’s the problem: <strong>that check was not bad</strong>.<!-- /divi:paragraph --></p>
<p><!-- divi:paragraph -->Here’s the bigger problem: this took almost two weeks to sort out and involved a huge amount of wasted time. It was practically by chance that I even caught the actual issue. There was an error in the image capturing system within a mobile banking app. An easy-to-miss mistake that may have already happened to you without you even being aware of it.</p>
<p><!-- divi:paragraph -->There are a lot of perks to working as a freelancer, but the financial side can get complicated. Rather than a steady stream of checks from a single source, you’re depending on a variety of clients to send you payments from different accounts on what can be a pretty random schedule.</p>
<p><!-- divi:paragraph -->I’m lucky enough to have amazing clients who I trust completely. Not that mistakes can’t happen! They absolutely can. People lose track of their balances, grab checks from the wrong account or a closed account, and so on. But I’ve been incredibly fortunate to have never yet had a client send a bad check.</p>
<p><!-- divi:paragraph -->You can imagine my surprise when, days after having a mobile deposit approved through the banking app for a major bank, that check was returned and my account was hit with a sizable fee. This is a client I’ve known and trusted for years, and it didn’t make sense. I reached out to the client, and he was as baffled by the situation as I was.</p>
<p><!-- divi:paragraph -->I contacted the bank and spent a very long time on the phone with someone who assured me that there was nothing that she could do. It was a bad check. When someone writes you a bad check, you have to pay a fee. You’re expected to have your client reimburse you for that fee. End of story. Your bank can’t possibly go to some other bank and demand that they pay the fee, so it’s up to you. Just go back to your client and get a new check.</p>
<p><!-- divi:paragraph -->Sounds legit, right?</p>
<h4 id="503f">Bad news, banker</h4>
<p><!-- divi:paragraph -->Unfortunately for her, I wasn’t going to let this go. Here’s where I have the advantage over a lot of other people out there who might take what she’s saying at face value. People who would simply return to the client and request a new check. One with added charges to cover the fee and possibly interest as well:</p>
<ul>
<li>I know and trust my client.</li>
<li>I was holding the check in my hand and could see that it was drawn on an account from this same bank. (This suggested that some information that she had was incorrect.)</li>
<li>I am familiar with some of the algorithms driving image capture and classification as well as their potential weak spots. (<a href="https://medium.com/@annebonner" target="_blank" rel="noreferrer noopener">I actually write about technology and artificial intelligence here on Medium.</a>)</li>
<li>I’m aware that this bank recently did a major tech upgrade.</li>
</ul>
<p><em>I used to be a personal banker with this same bank</em>. It was a long time ago, but I know what information a banker has access to and what steps she can and can’t take.</p>
<p>It’s unlikely that you are in this position and that’s why I’m writing this story.</p>
<p>How many people do you think have simply gone back after depositing a check on a banking app, gotten a new check, and paid the fee without identifying the actual problem?</p>
<p><!-- divi:paragraph --><strong>You will believe your banker.</strong> You will believe that you received a bad check and proceed from there. Your client will need to provide another check with additional fees. It could affect your relationship with them. The returned check may cause you to overdraw your account, incurring more fees and much larger problems. Multiple returned checks in a short period can cause you to lose your account. A lot of things can go very badly here, all because of an error in a machine learning algorithm.</p>
<p><!-- divi:paragraph -->I want you to have this information. I want you to know what you’re looking for. You can and should ask questions. Is the problem really with the banking app and not with the check? There are a lot of things a banker can’t tell you but plenty of information is available to you. Was the check returned because of insufficient funds? Is this an account that doesn’t exist? What other steps can you take?</p>
<p><!-- divi:paragraph -->What is really going on here?</p>
<p><!-- divi:paragraph -->Ask the questions!</p>
<p><!-- divi:image {"linkDestination":"custom"} --></p>
<figure class="wp-block-image"><img decoding="async" src="https://cdn-images-1.medium.com/max/1600/0*kmeK2QEAhQSzqpcI" alt="" /><figcaption><em>Photo by <a href="https://unsplash.com/@art_maltsev?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="noreferrer noopener">Artem Maltsev</a> on <a href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="noreferrer noopener">Unsplash</a></em></figcaption></figure>
<h4 id="7da3">So what happened?</h4>
<p><!-- divi:paragraph -->That was surprisingly hard to figure out. I’m sure the banker on the phone hopes she never hears from me again. Eventually, it turned out that the account that the check was drawn on was not the account number entered into the system. That means that the bank couldn’t locate the bank account in question.<!-- /divi:paragraph --></p>
<p><!-- divi:paragraph -->I had to wait for the bank to send me a certified copy of my deposit so that I could take that and the original check to a bank for a banker to examine in person. At that point, the banker could try to figure out the problem and determine whether or not to reverse the fee. (For added fun, your fee can generally only be reversed by the bank where you opened your account. That means that I’d have to wait for a banker I’ve never spoken with halfway across the country to decide whether to reverse my fee.)</p>
<p><!-- divi:paragraph -->So, I waited for the mail.<!-- /divi:paragraph --></p>
<p><!-- divi:paragraph -->It took close to a week. It’s not hard to imagine what this would be like for someone who was now overdrawn on their account because of a mistake entirely on the bank’s end.</p>
<h4 id="04f5">Take it to the bank</h4>
<p><!-- divi:paragraph -->Once the check arrived and my toddler could be spared from the world’s most boring adventure, I headed over to the bank and went through all of this again in person.</p>
<p><!-- divi:paragraph -->This banker said once again that this was a bad account. The customer definitely existed, but the account wasn’t an open account. It must be one that had been closed and my client had grabbed some old checks by mistake. I’d just have to get a new check.</p>
<p><!-- divi:paragraph -->Easy mistake. It could happen to anyone.</p>
<p><!-- divi:paragraph -->This was all so plausible, but I know my client. I also know routing numbers. This guy was not the sort of person who would be holding on to a box of checks from a closed account that he opened in a state where he lived seven years ago. Not happening.</p>
<p><!-- divi:paragraph -->Somehow, even though the banker knew with absolute certainty that she was in the right, she looked again.</p>
<p><!-- divi:paragraph -->She saw the problem. It was so simple.</p>
<p><!-- divi:paragraph --><strong>The image capture system that the banking app uses had cut off the last two digits of the account number on the check.</strong></p>
<p><!-- divi:paragraph -->That’s it.</p>
<p><!-- divi:paragraph -->That mistake, both tiny and massive, caused a returned check, a fee, and nearly two weeks’ worth of headaches and wasted time that could have been productive.</p>
<p><!-- divi:paragraph -->It could have been far worse.</p>
<p><!-- divi:quote --></p>
<blockquote class="wp-block-quote"><p><em>That bank error was practically invisible.</em></p></blockquote>
<h4 id="4b43">Enter artificial intelligence</h4>
<p><!-- divi:paragraph -->So where does this image capture system come from? Is it unique to this bank?</p>
<p><!-- divi:paragraph -->It turns out that most of the major banks use the same company for the image capture, recognition, and analysis systems within their banking apps. This company does incredible work and I’m in no way questioning that. I have zero proof that they are directly the cause of this issue.</p>
<p><!-- divi:paragraph -->The company uses artificial intelligence to develop algorithms for image recognition. They’re using machine learning algorithms to do incredible things with document and ID verification. They’ve created an image capture software development system built on computer vision and machine learning algorithms. It detects corners and glare, can detect and analyze images on a variety of backgrounds, contains built-in analytics, offers real-time image assessment, and has a lot of other cool features.</p>
<p><!-- divi:paragraph -->But it’s not, apparently, flawless.</p>
<p><!-- divi:paragraph -->That said, I don’t believe that this company is directly the cause of the problem. I know that the bank I use for my freelance work has recently undergone a major tech redesign and they’ve made changes to their online and mobile banking app. It might be possible that the redesign on the bank’s end caused a problem with the image capture system.</p>
<p><!-- divi:paragraph -->It also might be possible that there is a problem with the image capture system itself. I’m having a hard time locating any records of errors on the tech company’s part, but that doesn’t necessarily mean that there aren’t any. They might have an amazing PR department or a stellar legal team. Or there might not be a lot of other people who happen to be in a position to notice exactly what happened here.</p>
<p>I have no way of knowing where the fault lies or how often this happens. But it’s <strong>impossible</strong> that this has only happened one time.</p>
<p><!-- divi:paragraph -->This is potentially a massive problem. Even if 0.1% of customers who use mobile banking apps are having (or will have) this issue, that’s a huge problem within our economy. <a href="https://www.federalreserve.gov/econres/notes/feds-notes/mobile-banking-a-closer-look-at-survey-measures-20180327.htm" target="_blank" rel="noreferrer noopener">An enormous number of people use mobile banking apps</a> and that number is growing. The company that makes this technology also builds technology that’s used in ID and document scanning. Can it just drop numbers? These are the numbers that represent our bank accounts and our identities. This kind of mistake is not acceptable and will have extremely serious ramifications.</p>
<h4 id="0c7f">What’s the solution?</h4>
<p><!-- divi:paragraph -->I don’t know yet, but I know we need one. We definitely need to start training bankers to watch for this problem. Banking apps are not perfect. Remember that all of this technology is still new. Remember that you have the right to ask questions. You have the right to get to the bottom of the situation, even when people tell you that you are already there. This is almost certainly happening everywhere and we need to find a way to fix this problem.<!-- /divi:paragraph --></p>
<p><!-- divi:paragraph -->It’s up to all of us to pay attention. No one is going to solve this problem if they don’t know about it.</p>
<p><!-- divi:paragraph --><strong>Don’t let this slide. It’s too important.</strong><!-- /divi:paragraph --></p>
<p><!-- divi:paragraph -->If anyone else has had the same issue, feel free to discuss it in the comments below. As always, reach out any time on LinkedIn <a href="https://www.linkedin.com/in/annebonnerdata/" target="_blank" rel="noreferrer noopener">@annebonnerdata</a>.</p>
<p><!-- divi:block {"ref":451} /--></p></div>
			</div>
			</div>
				
				
				
				
			</div>
				
				
			</div></p>
<span class="et_bloom_bottom_trigger"></span><p>The post <a href="https://contentsimplicity.com/your-mobile-banking-app-has-a-problem/">Your Mobile Banking App has a Problem</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://contentsimplicity.com/your-mobile-banking-app-has-a-problem/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">447</post-id>	</item>
		<item>
		<title>Multiple Linear Regression in 4 Lines of Code!</title>
		<link>https://contentsimplicity.com/machine-learning-multiple-linear-regression/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=machine-learning-multiple-linear-regression&#038;utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=machine-learning-multiple-linear-regression</link>
					<comments>https://contentsimplicity.com/machine-learning-multiple-linear-regression/#respond</comments>
		
		<dc:creator><![CDATA[Anne B]]></dc:creator>
		<pubDate>Thu, 30 May 2019 20:25:08 +0000</pubDate>
				<category><![CDATA[data]]></category>
		<category><![CDATA[tutorial]]></category>
		<category><![CDATA[artificial intelligence]]></category>
		<category><![CDATA[Coding]]></category>
		<category><![CDATA[featured]]></category>
		<category><![CDATA[linear regression]]></category>
		<category><![CDATA[machine learning]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[regression]]></category>
		<category><![CDATA[statistics]]></category>
		<guid isPermaLink="false">http://contentsimplicity.com/?p=422</guid>

					<description><![CDATA[<p>Conquer the basics of multiple linear regression (and backward elimination!) and use your data to predict the future!</p>
<p>The post <a href="https://contentsimplicity.com/machine-learning-multiple-linear-regression/">Multiple Linear Regression in 4 Lines of Code!</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<h3 class="wp-block-heading" id="a4ee">Conquer the basics of multiple linear regression (and backward elimination!) and use your data to predict the&nbsp;future!</h3>



<p class="wp-block-paragraph"><em>(This article first appeared in </em><a href="https://towardsdatascience.com/multiple-linear-regression-in-four-lines-of-code-b8ba26192e84"><em>Towards Data Science</em></a><em>)</em></p>



<p class="wp-block-paragraph">Multiple linear regression is a lot of fun. Being able to predict the future is awesome.</p>



<p class="wp-block-paragraph">You might want to predict how well a stock will do based on some other information that you just happen to have.</p>



<p class="wp-block-paragraph">You might want to know whether how often you bathe and how many cats you have relate to how long you’ll live.</p>



<p class="wp-block-paragraph">You might want to figure out if there’s a relationship between above-average divorce rates and a man who 1.) calls his mom more than three times a day, 2.) refers to another man as “bro,” and 3.) has never done his own laundry.</p>



<p class="wp-block-paragraph">Multiple linear regression might be for you!</p>



<p class="wp-block-paragraph">Multiple linear regression is fun because it looks at the relationships within a bunch of information. Instead of just looking at how&nbsp;<strong>one</strong>&nbsp;thing relates to another thing (simple linear regression), you can look at the relationship between a lot of different things and the thing you want to predict.</p>



<p class="wp-block-paragraph">A&nbsp;<strong>linear regression model</strong>&nbsp;is a statistical model that’s frequently used in data science. It’s also one of the basic building blocks of machine learning!&nbsp;<strong>Multiple linear regression</strong>&nbsp;(MLR/multiple regression) is a statistical technique. It can use several variables to predict the outcome of a different variable. The&nbsp;<strong>goal of multiple regression is to model the linear relationship between your independent variables and your dependent variable</strong>. It looks at how multiple independent variables are related to a dependent variable.</p>



<p class="wp-block-paragraph">I’m going to assume that you know a little bit about simple linear regression. If you don’t, check out&nbsp;<a rel="noreferrer noopener" href="https://towardsdatascience.com/simple-linear-regression-in-four-lines-of-code-d690fe4dba84" target="_blank">this article on building a simple linear regressor</a>. It will give you a quick (and fun) walk-through of the basics.</p>



<p class="wp-block-paragraph"><strong>Simple linear regression</strong>&nbsp;is what you can use when you have one independent variable and one dependent variable.&nbsp;<strong>Multiple linear regression</strong>&nbsp;is what you can use when you have a bunch of different independent variables!</p>



<p class="wp-block-paragraph">Multiple regression analysis has three main uses.</p>



<ul class="wp-block-list"><li>You can look at the strength of the effect of the independent variables on the dependent variable.</li><li>You can use it to ask how much the dependent variable will change if the independent variables are changed.</li><li>You can also use it to predict trends and future values.</li></ul>



<p class="wp-block-paragraph">Let’s do that one!</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/05/a6703-1gw9dnyxtyu4as3d77e7prg.jpeg?w=1080&#038;ssl=1" alt=""/><figcaption>Image by&nbsp;<a href="https://pixabay.com/users/RondellMelling-57942/" rel="noreferrer noopener" target="_blank">RondellMelling</a>&nbsp;via&nbsp;<a href="http://pixabay.com/" rel="noreferrer noopener" target="_blank">Pixabay</a></figcaption></figure>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"><p>We’re going to keep things super simple here so that multiple linear regression as a whole makes sense. I do want you to know that things can get a lot more complex than this in the real world.</p></blockquote>



<h4 class="wp-block-heading" id="5f7b">How do I&nbsp;begin?</h4>



<p class="wp-block-paragraph">For the purposes of this post, you are now working for a venture capitalist.</p>



<p class="wp-block-paragraph">Congratulations!</p>



<p class="wp-block-paragraph">So here’s the thing: you have a dataset in front of you with information on 50 companies. You have five columns that contain information about how much those companies spend on admin, research and development (R&amp;D), and marketing, their location by state, and their profit for the most recent year. This dataset is anonymized, which means we don’t know the names of these companies or any other identifying information.</p>



<p class="wp-block-paragraph">You’ve been hired to analyze this information and create a model. You need to inform the guy who hired you what kind of companies will make the most sense in the future to invest in. To keep things simple, let’s say that your employer wants to make this decision based on last year’s profit. This means that the profits column is your&nbsp;<strong>dependent variable</strong>. The other columns are the&nbsp;<strong>independent variables</strong>.</p>



<p class="wp-block-paragraph">So you want to learn about the&nbsp;<strong>dependent variable</strong>&nbsp;(profit) based on the other categories of information you have.</p>



<p class="wp-block-paragraph">The guy who hired you doesn’t want to invest in these specific companies. He wants to use the information in this dataset as a sample. This sample will help him understand which of the companies he looks at in the future will perform better based on the same information.</p>



<p class="wp-block-paragraph">Does he want to invest in companies that spend a lot on R&amp;D? Marketing? Does he want to invest in companies that are based in Illinois? You need to help him create a set of guidelines. You’re going to help him be able to say something along the lines of, “I’m interested in a company that’s based in New York that spends very little on admin expenses but a lot on R&amp;D.”</p>



<p class="wp-block-paragraph">You’re going to come up with a model that will allow him to assess where and into which companies he wants to invest to maximize his profit.</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"><p>Linear regression is great for correlation, but remember that&nbsp;<strong>correlation and causation are not the same thing</strong>! You are not saying that one thing causes the other. You’re finding which independent variables are strongly correlated with the dependent variable.</p></blockquote>



<p class="wp-block-paragraph">There are some assumptions that absolutely have to be true:</p>



<ul class="wp-block-list"><li>There is a linear relationship between the dependent variable and the independent variables.</li><li>The independent variables aren’t too highly correlated with each other.</li><li>Your observations for the dependent variable are selected independently and at random.</li><li>Regression residuals are normally distributed.</li></ul>



<p class="wp-block-paragraph">You need to check that these assumptions are true before you proceed and build your model. We’re totally skipping past that here. Make sure that if you’re doing this in the real world, you aren’t just blindly following this tutorial. Those assumptions need to be correct when you’re building your regression!</p>



<h4 class="wp-block-heading" id="58aa">Dummy variables</h4>



<p class="wp-block-paragraph">If you aren’t familiar with the concept of dummy variables, check out&nbsp;<a rel="noreferrer noopener" href="https://towardsdatascience.com/the-complete-beginners-guide-to-data-cleaning-and-preprocessing-2070b7d4c6d" target="_blank">this article on data cleaning and preprocessing</a>. It has some simple code that we can go ahead and copy and paste here.</p>



<p class="wp-block-paragraph">So we’ve already decided that “profit” is our dependent variable (<strong>y</strong>) and the others are our independent variables (<strong>X</strong>). We’ve also decided that what we want is a linear regression model. What about that column of states? “State” is a categorical variable, not a numerical variable. We need our independent variables to be numbers, not words. What do we do?</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/05/42e03-16ayoo0pjmeismqfzu-bifq.jpeg?w=1080&#038;ssl=1" alt=""/><figcaption>Photo by&nbsp;<a href="https://pixabay.com/users/3dman_eu-1553824/" rel="noreferrer noopener" target="_blank">3dman_eu</a>&nbsp;via&nbsp;<a href="http://pixabay.com/" rel="noreferrer noopener" target="_blank">Pixabay</a></figcaption></figure>



<h4 class="wp-block-heading" id="26eb">Let’s create a dummy variable!</h4>



<p class="wp-block-paragraph">If you looked at the information in the locations column, you might see that all of the companies that are being examined are based in two states. For the purposes of this explanation, let’s say all of our companies are located in either New York or Minnesota. That means that we’ll want to turn this one column of information into two columns of 1s and 0s. (If you want to learn more about why we’re doing that, check out&nbsp;<a href="https://towardsdatascience.com/simple-linear-regression-in-four-lines-of-code-d690fe4dba84" target="_blank" rel="noreferrer noopener">that article on simple linear regression</a>. It explains why this would be the best way to arrange our data.)</p>



<p class="wp-block-paragraph">So how do we populate those columns? Basically, we’ll turn each state into its own column. If a company is located in New York, it will have a&nbsp;<strong>1</strong>&nbsp;in the “New York” column and a&nbsp;<strong>0</strong>&nbsp;in the “Minnesota” column. If you were using more states, you’d have a 1 in the New York column, and, for example, a 0 in the “California” column, a zero in the “Illinois” column, a 0 in the Arkansas column, and so on. We won’t be using the original “locations” column anymore because we won’t need it!</p>



<p class="wp-block-paragraph">These 1s and 0s are basically working as a light switch. 1 is “on” or “yes” and 0 is “off” or “nope.”</p>
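
<p class="wp-block-paragraph">If you’d like to see the light switch idea in action, here’s a minimal sketch. It assumes a tiny made-up table of four companies and uses pandas’ <code>get_dummies</code>, which isn’t the encoder we’ll use later in this post. It’s just a quick way to see the 1s and 0s.</p>

<pre class="wp-block-preformatted">import pandas as pd

# Made-up example data: four companies and the state each one is based in.
companies = pd.DataFrame({"State": ["New York", "Minnesota", "New York", "Minnesota"]})

# One new column per state; a 1 marks the state the company is located in.
dummies = pd.get_dummies(companies["State"], dtype=int)
print(dummies)</pre>

<p class="wp-block-paragraph">Each row has exactly one 1, because each company is based in exactly one state.</p>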



<h4 class="wp-block-heading" id="5b09">Beware the dummy variable&nbsp;trap</h4>



<p class="wp-block-paragraph">You never want to include both variables at the same time.</p>



<p class="wp-block-paragraph">Why is that?</p>



<p class="wp-block-paragraph">You’d be duplicating a variable. The first variable (d1) is always equal to 1 minus the second variable (d2). (<strong>d1 = 1-d2</strong>) When one independent variable can be predicted from the others like this, it’s called&nbsp;<strong>multicollinearity</strong>. As a result, the model wouldn’t be able to distinguish the effects of d1 from the effects of d2. You can’t have the constant and both dummy variables at the same time. If you have nine dummy variables, include eight of them. (If you have two sets of dummy variables, then you have to do this for each set.)</p>
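
<p class="wp-block-paragraph">Here’s a small sketch of why that’s a problem, using made-up dummy columns for four companies. NumPy’s <code>matrix_rank</code> counts how many columns carry independent information.</p>

<pre class="wp-block-preformatted">import numpy as np

# Made-up dummy columns for four companies (d1 = New York, d2 = Minnesota).
d1 = np.array([1, 0, 1, 0])
d2 = 1 - d1  # d2 is completely determined by d1
constant = np.ones(4, dtype=int)

# Constant plus both dummies: three columns in total.
trapped = np.column_stack([constant, d1, d2])
# Constant plus one dummy: two columns in total.
safe = np.column_stack([constant, d1])

rank_trapped = np.linalg.matrix_rank(trapped)
rank_safe = np.linalg.matrix_rank(safe)
print(trapped.shape[1], "columns, rank", rank_trapped)
print(safe.shape[1], "columns, rank", rank_safe)</pre>

<p class="wp-block-paragraph">Three columns with a rank of two means one column is redundant. Drop one dummy and every remaining column pulls its own weight.</p>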



<h4 class="wp-block-heading" id="7b06">What is the&nbsp;P-value?</h4>



<p class="wp-block-paragraph">You’re going to want to be familiar with the concept of a P-value. That’s definitely going to come up.</p>



<p class="wp-block-paragraph">The P-value is the probability of getting a sample like ours (or more extreme than ours) if the null hypothesis is true.</p>



<p class="wp-block-paragraph"><strong>It gives a value to the weirdness of your sample</strong>. If you have a large P-value, then you probably won’t change your mind about the null hypothesis. A large value means that it wouldn’t be at all surprising to get a sample like yours if the hypothesis is true. As the P-value gets smaller, you should probably start to ask yourself some questions. You might want to change your mind and maybe even reject the hypothesis.</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"><p>The&nbsp;<strong>null hypothesis</strong>&nbsp;is the official way to refer to the claim (hypothesis) that’s on trial here. It’s the default position where there’s just no association among the groups that are being tested. In every experiment, you’re looking for an effect among the groups that are being tested. Unfortunately, there’s always the possibility that there’s no effect (or no difference) between the groups. That lack of difference is called the null hypothesis.</p></blockquote>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"><p>It’s like if you were doing a trial of a drug that doesn’t work. In that trial, there just wouldn’t be a difference between the group that took the drug and the rest of the population. The difference would be null.</p></blockquote>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"><p>You always assume that the null hypothesis is true until you have evidence that it isn’t.</p></blockquote>
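
<p class="wp-block-paragraph">To make “a sample like ours or more extreme” concrete, here’s a sketch with a made-up experiment that has nothing to do with our startups: 100 coin flips that come up heads 60 times. The null hypothesis is that the coin is fair.</p>

<pre class="wp-block-preformatted">from math import comb

# Made-up experiment: we flip a coin 100 times and see 60 heads.
# Null hypothesis: the coin is fair (heads with probability 0.5).
flips = 100
heads_seen = 60

def probability_of(heads):
    """Probability of exactly this many heads in 100 fair flips."""
    return comb(flips, heads) * 0.5 ** flips

# "Like ours or more extreme" means 60 or more heads or, by symmetry, 40 or fewer.
upper_tail = sum(probability_of(k) for k in range(heads_seen, flips + 1))
lower_tail = sum(probability_of(k) for k in range(0, flips - heads_seen + 1))
p_value = upper_tail + lower_tail
print(round(p_value, 4))</pre>

<p class="wp-block-paragraph">You’d then compare that number with the significance level you picked ahead of time.</p>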



<h4 class="wp-block-heading" id="6b40">Let’s keep&nbsp;moving!</h4>



<p class="wp-block-paragraph">We need to figure out which columns we want to keep and which we want to toss. If you just chuck a bunch of stuff into your model, it won’t be a good one. It definitely won’t be reliable! (Also, at the end of the day, you need to be able to explain your model to the guy who hired you to create this thing. You’re only going to want to explain the variables that actually predict something!)</p>



<p class="wp-block-paragraph">There are essentially five methods of building a multiple linear regression model.</p>



<ol class="wp-block-list"><li>Chuck Everything In and Hope for the Best</li><li>Backward Elimination</li><li>Forward Selection</li><li>Bidirectional Elimination</li><li>Score Comparison</li></ol>



<p class="wp-block-paragraph">You’ll almost certainly hear about Stepwise Regression as well. Stepwise regression is most commonly used as another way of saying bidirectional elimination (method 4). Sometimes when people use that phrase they’re referring to a combination of methods 2, 3, and 4. (That’s the idea behind bidirectional elimination as well.)</p>



<p class="wp-block-paragraph"><strong>Method 1</strong>&nbsp;(Chuck Everything In): Okay. That isn’t the official name for this method (but it should be). Occasionally you’ll need to build a model where you just throw in all your variables. You might have some kind of prior knowledge. You might have a particular framework you need to use. You might have been hired by someone who’s insisting that you do that. You might want to prepare for backward elimination. It’s a real option, so I’m including it here.</p>



<p class="wp-block-paragraph"><strong>Method 2</strong>&nbsp;(backward elimination): This has a few basic steps.</p>



<ol class="wp-block-list"><li>First, you’ll need to set a significance level for which data will stay in the model. For example, you might want to set a significance level of 5% (SL = 0.05). This is important and can have real ramifications, so give it some thought.</li><li>Next, you’ll fit the full model with all possible predictors.</li><li>You’ll consider the predictor with the highest P-value. If that P-value is greater than your significance level, you’ll move to step four. Otherwise, you’re done!</li><li>Remove that predictor with the highest P-value.</li><li>Fit the model without that predictor variable. You can’t just remove the variable and stop there. You need to refit and rebuild the model, because the coefficients and constants will be different. When you remove one, it affects the others.</li><li>Go back to step 3, do it all over, and keep doing that until you come to a point where even the highest P-value is &lt; SL. Now your model is ready. All of the variables that are left have P-values that are less than the significance level.</li></ol>



<p class="wp-block-paragraph">(After we go through these concepts, I’ll walk you through an example of backward elimination so you can see it in action! It’s definitely confusing, but if you really look at what’s going on, you’ll get the hang of it.)</p>



<p class="wp-block-paragraph"><strong>Method 3</strong>&nbsp;(forward selection): This is way more complex than just reversing backward elimination.</p>



<ol class="wp-block-list"><li>Choose your significance level (SL = 0.05).</li><li>Fit all possible simple regression models and select the one with the lowest P-value.</li><li>Keep this variable and fit all possible models with one extra predictor added to the one you already have. If we selected a simple linear regressor with one variable, now we’d fit every possible two-variable linear regression that includes it.</li><li>Find the predictor with the lowest P-value. If P &lt; SL, go back to step 3. Otherwise, you’re done!</li></ol>



<p class="wp-block-paragraph">We can stop when P&lt;SL is no longer true, or there are no more P-values that are less than the significance level. It means that the variable is not significant anymore.&nbsp;<strong>You won’t keep the current model,</strong>&nbsp;<strong>though</strong>. You’ll keep the previous one because, in the final model, your variable is insignificant.</p>
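
<p class="wp-block-paragraph">The forward selection steps above could be sketched like this. It’s an illustration under a few assumptions, not a standard library routine: the function name <code>forward_selection</code> is made up, the data is randomly generated, and the models are fit with statsmodels (which shows up again later in this post).</p>

<pre class="wp-block-preformatted">import numpy as np
import statsmodels.api as sm

def forward_selection(X, y, sl=0.05):
    """Return the indexes of the columns of X chosen by forward selection."""
    selected = []
    remaining = list(range(X.shape[1]))
    while remaining:
        # Fit one model per candidate: the columns kept so far plus one new one.
        candidate_pvalues = {}
        for col in remaining:
            exog = sm.add_constant(X[:, selected + [col]])
            results = sm.OLS(y, exog).fit()
            candidate_pvalues[col] = results.pvalues[-1]  # P-value of the new column
        best = min(candidate_pvalues, key=candidate_pvalues.get)
        # Stop as soon as the best candidate is not significant.
        # The model built so far (without this candidate) is the one we keep.
        if candidate_pvalues[best] >= sl:
            break
        selected.append(best)
        remaining.remove(best)
    return selected

# Made-up data: only columns 0 and 2 actually influence y.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(scale=0.5, size=200)

chosen = forward_selection(X, y)
print(sorted(chosen))</pre>

<p class="wp-block-paragraph">Because the fake y was built from columns 0 and 2, those are the ones you’d expect to be picked. A noise column can occasionally sneak in by chance, which is the risk you accept when you choose a significance level.</p>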



<p class="wp-block-paragraph"><strong>Method 4</strong>&nbsp;(bidirectional elimination): This method combines the previous two!</p>



<ol class="wp-block-list"><li>Select a significance level to enter and a significance level to stay (SLENTER = 0.05, SLSTAY = 0.05).</li><li>Perform the next step of forward selection where you add the new variable. You need to have your P-value be less than SLENTER.</li><li>Now perform all of the steps of backward elimination. The variables must have a P-value less than SLSTAY in order to stay.</li><li>Now head back to step two, then move forward to step 3, and so on until no new variables can enter and no new variables can exit.</li></ol>



<p class="wp-block-paragraph">You’re done!</p>



<p class="wp-block-paragraph"><strong>Method 5</strong>&nbsp;(score comparison): Here, you’re going to be looking at all possible models. You’ll compare the scores for every one of them. This is definitely the most resource-consuming approach!</p>



<ol class="wp-block-list"><li>Select a criterion of goodness of fit (for example,&nbsp;<a href="https://en.wikipedia.org/wiki/Akaike_information_criterion" rel="noreferrer noopener" target="_blank">Akaike criterion</a>)</li><li>Construct all possible regression models</li><li>Select the one with the best criterion</li></ol>



<p class="wp-block-paragraph">Fun fact: if you have 10 columns of data, you’ll wind up with 1,023 models here. You’d better be ready to commit if you’re going to go this route!</p>
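
<p class="wp-block-paragraph">If you’re wondering where 1,023 comes from: every column is either in a model or out of it, which gives 2 to the power of 10 combinations, minus the one empty model with no columns at all. A quick sketch (the column names are placeholders):</p>

<pre class="wp-block-preformatted">from itertools import combinations

columns = [f"x{i}" for i in range(1, 11)]  # ten hypothetical column names

# Every non-empty subset of the ten columns is one candidate model.
n_models = sum(
    1
    for size in range(1, len(columns) + 1)
    for _ in combinations(columns, size)
)
print(n_models)  # 1023, which is 2**10 - 1</pre>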



<h4 class="wp-block-heading" id="0a4f">Ummm, what?</h4>



<p class="wp-block-paragraph">If you’re just getting started with machine learning, statistics, or data science, that all looks like it will be an insane amount of code. It’s not!</p>



<p class="wp-block-paragraph">So much of what you need to do with a machine learning model is all ready to go with the amazing libraries out there. You’ll need to do the tough parts where you decide what information is important and what kind of models you’ll want to use. It’s also up to you to interpret the results and be able to communicate what you’ve built. However, the code itself is very doable.</p>



<h4 class="wp-block-heading" id="7d35">Let me show&nbsp;you!</h4>



<p class="wp-block-paragraph">Backward elimination is the fastest and the best method to start with, so that’s what I’m going to walk you through after we build the quick and easy multiple linear regression model.</p>



<p class="wp-block-paragraph">First, let’s prepare our dataset. Let’s say we have a&nbsp;.csv file called “startups.csv” that contains the information we talked about earlier. We’ll say it has 50 companies and columns for R&amp;D spending, admin spending, marketing spending, what state the company is located in (let’s say, New York, Minnesota, and California), and one column for last year’s profit.</p>



<p class="wp-block-paragraph">It’s a good idea to import your libraries right away.</p>



<pre class="wp-block-preformatted"># Importing the libraries<br>import numpy as np<br>import matplotlib.pyplot as plt<br>import pandas as pd</pre>



<p class="wp-block-paragraph">Now we can go ahead and copy and paste the code from <a href="https://towardsdatascience.com/the-complete-beginners-guide-to-data-cleaning-and-preprocessing-2070b7d4c6d" target="_blank" rel="noreferrer noopener">that data cleaning and preparation article</a>! We’re definitely going to want to change the name of the dataset to ours. I’m calling it ‘startups.csv.’ We’ll adjust a couple of other tiny details as well. Profit (y) is still our last column, so we’ll continue to leave it out of X with [:,&nbsp;:-1]. We’ll make a little adjustment to grab our dependent variable with [:, 4]. Now we have a vector of the dependent variable (y) and a matrix of independent variables that contains everything except the profits (X). We want to see if there is a linear dependency between the two!</p>



<pre class="wp-block-preformatted">dataset = pd.read_csv('startups.csv')<br>X = dataset.iloc[:, :-1].values<br>y = dataset.iloc[:, 4].values</pre>



<p class="wp-block-paragraph">Now we need to encode the categorical variable. We can use label encoder and one hot encoder to create dummy variables. (We can copy and paste this from that other article too! Make sure you’re grabbing the right information and you don’t encode the dependent variable.) You’re going to change the index of the column in both spots [:, 3] and [:, 3] again, and replace the index in one hot encoder too [3].</p>



<pre class="wp-block-preformatted">from sklearn.preprocessing import LabelEncoder, OneHotEncoder<br>labelencoder = LabelEncoder()<br>X[:, 3] = labelencoder.fit_transform(X[:, 3])<br>onehotencoder = OneHotEncoder(categorical_features = [3])<br>X = onehotencoder.fit_transform(X).toarray()</pre>
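
<p class="wp-block-paragraph">A heads-up if you’re running a recent version of scikit-learn: newer releases no longer accept the <code>categorical_features</code> argument, so the snippet above may raise an error. One alternative is <code>ColumnTransformer</code>. Here’s a sketch on a few made-up rows that follow the same column order as our dataset (the numbers and the name “state” are placeholders, not values from startups.csv):</p>

<pre class="wp-block-preformatted">import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Made-up rows in the same column order as the post:
# R&D spend, admin spend, marketing spend, state.
X = np.array([
    [165000.0, 137000.0, 472000.0, "New York"],
    [163000.0, 151000.0, 444000.0, "California"],
    [153000.0, 101000.0, 408000.0, "Minnesota"],
    [144000.0, 119000.0, 383000.0, "New York"],
], dtype=object)

# One-hot encode column 3 and pass the numeric columns through untouched.
transformer = ColumnTransformer(
    transformers=[("state", OneHotEncoder(), [3])],
    remainder="passthrough",
    sparse_threshold=0,  # always return a dense array
)
X_encoded = transformer.fit_transform(X).astype(float)
print(X_encoded.shape)  # (4, 6): three state columns, then three spending columns</pre>

<p class="wp-block-paragraph">The encoded state columns land at the front, followed by the spending columns, which is the layout the rest of this walkthrough relies on.</p>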



<p class="wp-block-paragraph">You’re ready to go! Our one column of information is now three columns, each of which corresponds to one state!</p>



<p class="wp-block-paragraph">What about avoiding the dummy variable trap? You don’t actually need to do that to get predictions out of scikit-learn’s linear regression! It will still fit and predict with the redundant column in place. If you want to interpret the individual coefficients or P-values, though (like we will during backward elimination), you’ll want to drop one dummy column. It’s simple! You can do that with one line right after you encode your data.</p>



<pre class="wp-block-preformatted">X=X[:, 1:]</pre>



<p class="wp-block-paragraph">What does that do? It removes the first column from X. Putting the&nbsp;<strong>1</strong>&nbsp;there means that we want to take all of the columns starting at index 1 to the end. You won’t take the first column. For some libraries, you’ll need to take one column away manually to be sure your dataset won’t contain redundancies.</p>



<p class="wp-block-paragraph">Now let’s split our training and testing data. The most common split is an 80/20 split, which means 80% of our data would go to training our model and 20% would go to testing it. Let’s do that here!</p>



<pre class="wp-block-preformatted">from sklearn.model_selection import train_test_split<br>X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)</pre>



<h4 class="wp-block-heading" id="4f9f">What about feature&nbsp;scaling?</h4>



<p class="wp-block-paragraph">We don’t need to do feature scaling here! With plain linear regression, the coefficients adjust to whatever scale each feature is on, so the predictions come out the same either way.</p>
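
<p class="wp-block-paragraph">If you want to convince yourself that skipping feature scaling is safe for plain linear regression, here’s a sketch on randomly generated data (made up for illustration). It fits one model on the raw numbers and one on standardized numbers, then checks that the predictions match.</p>

<pre class="wp-block-preformatted">import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Made-up spending figures on very different scales, plus a made-up profit.
X = np.column_stack([
    rng.uniform(0, 200000, size=50),
    rng.uniform(50000, 150000, size=50),
    rng.uniform(0, 500000, size=50),
])
y = 50000 + 0.8 * X[:, 0] + 0.05 * X[:, 2] + rng.normal(scale=5000, size=50)

# Model 1: fit on the raw numbers.
raw_predictions = LinearRegression().fit(X, y).predict(X)

# Model 2: fit on standardized numbers (mean 0, standard deviation 1).
X_scaled = StandardScaler().fit_transform(X)
scaled_predictions = LinearRegression().fit(X_scaled, y).predict(X_scaled)

print(np.allclose(raw_predictions, scaled_predictions))</pre>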



<figure class="wp-block-image"><img decoding="async" src="https://cdn-images-1.medium.com/max/1600/0*2IUiA_zJTg6VPrQz" alt=""/><figcaption>Photo by&nbsp;<a href="https://unsplash.com/@gift?utm_source=medium&amp;utm_medium=referral" rel="noreferrer noopener" target="_blank">Gift Habeshaw</a>&nbsp;on&nbsp;<a href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" rel="noreferrer noopener" target="_blank">Unsplash</a></figcaption></figure>



<h4 class="wp-block-heading" id="22f7">Multiple linear regression time!</h4>



<p class="wp-block-paragraph">We’ll import linear regression from Scikit-Learn. (That makes a little sense, doesn’t it?)</p>



<pre class="wp-block-preformatted">from sklearn.linear_model import LinearRegression</pre>



<p class="wp-block-paragraph">Now we’ll introduce our regressor. We’ll create an object of the class LinearRegression and we’ll fit the object to our training set. We want to apply this to both our X_train and y_train.</p>



<pre class="wp-block-preformatted">regressor = LinearRegression()<br>regressor.fit(X_train, y_train)</pre>



<p class="wp-block-paragraph">Now let’s test the performance of our multiple linear regressor!</p>



<p class="wp-block-paragraph">(We won’t plot a graph here because we’d need five dimensions to do that. If you’re interested in plotting a graph with a simple linear regressor, check out&nbsp;<a href="https://towardsdatascience.com/simple-linear-regression-in-four-lines-of-code-d690fe4dba84" target="_blank" rel="noreferrer noopener">this article on building a simple linear regressor</a>.)</p>



<p class="wp-block-paragraph">We’ll create the vector of predictions (y_pred). We can use the regressor with the predict method to predict the observations of the test set (X_test).</p>



<pre class="wp-block-preformatted">y_pred = regressor.predict(X_test)</pre>



<p class="wp-block-paragraph">That’s it! Four lines of code and you’ve built a multiple linear regressor!</p>



<p class="wp-block-paragraph">Now we can see the ten predicted profits! You can print them any time with a simple&nbsp;<code>print(y_pred)</code>. We can easily compare them by taking a look at the predictions and then comparing them to the actual results. If you were to take a look, you’d see that some are incredibly accurate and the rest are pretty darn good. Nice work!</p>
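
<p class="wp-block-paragraph">Here’s one way to line the predictions up against the actual values. This sketch generates its own fake data so it can run on its own. In the walkthrough you’d already have y_test and y_pred and would only need the last four lines. The R-squared score from scikit-learn is an extra I’m adding here as a single-number summary.</p>

<pre class="wp-block-preformatted">import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Made-up data standing in for the 50 companies.
rng = np.random.default_rng(0)
X = rng.uniform(0, 200000, size=(50, 3))
y = 50000 + 0.8 * X[:, 0] + 0.03 * X[:, 2] + rng.normal(scale=8000, size=50)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
regressor = LinearRegression().fit(X_train, y_train)
y_pred = regressor.predict(X_test)

# Put actual and predicted values side by side, with the gap between them.
comparison = pd.DataFrame({"actual": y_test, "predicted": y_pred})
comparison["error"] = comparison["predicted"] - comparison["actual"]
print(comparison.round(0))
print("R-squared on the test set:", round(r2_score(y_test, y_pred), 3))</pre>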



<p class="wp-block-paragraph">There is definitely some linear dependency between our dependent and independent variables. We can clearly see a strong linear relationship between the two.</p>



<p class="wp-block-paragraph">Congratulations!! You now know how to make a multiple linear regressor in Python!</p>



<h4 class="wp-block-heading" id="ea78">Want to keep&nbsp;going?</h4>



<p class="wp-block-paragraph">Things are about to get more challenging!</p>



<p class="wp-block-paragraph">What if some of the variables have a lot of impact on our dependent variable and some are statistically insignificant? We can definitely find out which are the variables that have the highest impact on the dependent variable. We’ll want to find a team of variables that all have a definite effect, positive or negative.</p>



<p class="wp-block-paragraph">Let’s use&nbsp;<strong>backward elimination</strong>!</p>



<p class="wp-block-paragraph">We need to prepare something specific for backward elimination. We want the statsmodels library, so let’s import statsmodels.api. That’s a little long to have to keep retyping, so we’ll make a shortcut using sm.</p>



<pre class="wp-block-preformatted">import statsmodels.api as sm</pre>



<p class="wp-block-paragraph">We need to add a column of ones in our matrix of features of independent variables because of the way it works with the constant. (Our model needs to take into account our constant b0. In most libraries it’s included, but not in the statsmodels class that we’re using. We’ll add a column of ones so statsmodels will understand the formula correctly.)</p>



<p class="wp-block-paragraph">This starts pretty simply. We’ll use&nbsp;.append because we want to append.</p>



<p class="wp-block-paragraph">(Love Python ❤️)</p>



<p class="wp-block-paragraph">We have our matrix of features X. The values argument is perfect for us because it’s an array. We’ll input a matrix of 50 rows and one column with 1s inside. We can create that with NumPy’s np.ones. We’ll need to specify the number of rows and columns we want (50,1). We need to convert the array into the integer type to make this work, so we’ll use&nbsp;.astype(int). Then we need to decide if we’re adding a row or a column (row = 0, column = 1), so we’ll say axis = 1 for a column!</p>



<p class="wp-block-paragraph">We want this column to be located at the beginning of our dataset. What do we do? Let’s add matrix X to the column of 50 ones, rather than the other way around. We can do that with values = X.</p>



<pre class="wp-block-preformatted">X = np.append(arr = np.ones((50, 1)).astype(int), values = X, axis = 1)</pre>
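
<p class="wp-block-paragraph">statsmodels also ships a helper, <code>add_constant</code>, that does the same job. Here’s a sketch on a small made-up matrix showing that it matches the np.append approach:</p>

<pre class="wp-block-preformatted">import numpy as np
import statsmodels.api as sm

X = np.arange(12, dtype=float).reshape(4, 3)  # made-up 4 x 3 feature matrix

# add_constant puts the column of ones at the front by default.
X_with_constant = sm.add_constant(X)

# The same thing done by hand, as in the walkthrough.
X_by_hand = np.append(arr=np.ones((4, 1)), values=X, axis=1)

print(np.array_equal(X_with_constant, X_by_hand))</pre>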



<h4 class="wp-block-heading" id="9574">Let’s do&nbsp;this!</h4>



<p class="wp-block-paragraph">We want to create a new matrix of our optimal features (X_opt). These features are the ones that are statistically significant. The ones that have a high impact on the profit. This will be the matrix containing the team of optimal features with high impact on the profit.</p>



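<p class="wp-block-paragraph">We’ll need to initialize it. We can remove the variables that are not statistically significant one by one. We’ll do this by removing the index at each step. First take all the indexes of the columns in X, separated by commas [0, 1, 2, 3, 4, 5]. Index 0 is the column of ones, 1 and 2 are the state dummy columns that are left after dropping one to avoid the dummy variable trap, and 3, 4, and 5 are R&amp;D, admin, and marketing spending.</p>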



<p class="wp-block-paragraph">If you look back at the methods earlier, you’ll see that we first need to select our significance level, which we talked about earlier. Then we need to fit the model!</p>



<p class="wp-block-paragraph">We aren’t going to take the regressor we built. We’re using a new library, so now we need a new fit to our future optimal matrix. We’ll create a new regressor (our last one was from the linear regression library). Our new class will be ordinary least squares (OLS). We’ll need to call the class and specify some arguments. (You can check out&nbsp;<a href="https://www.statsmodels.org/dev/generated/statsmodels.regression.linear_model.OLS.html" rel="noreferrer noopener" target="_blank">the official documentation here</a>.) For our arguments, we’ll need an endog (our dependent variable) and an exog (our X_opt, which is just our matrix of features (X) with the intercept, which isn’t included by default). In order to fit it we’ll just use a&nbsp;.fit()!</p>



<pre class="wp-block-preformatted">X_opt = X[:, [0, 1, 2, 3, 4, 5]]<br>regressor_OLS = sm.OLS(endog = y, exog = X_opt).fit()</pre>



<p class="wp-block-paragraph">Now we’ve initialized X_opt!</p>



<p class="wp-block-paragraph">Now let’s look at our P-values! How do we look for the predictor with the highest P-values? We’ll take our regressor object and call the function&nbsp;.summary().</p>



<pre class="wp-block-preformatted">regressor_OLS.summary()</pre>



<p class="wp-block-paragraph">Now we can see a table with some very useful information about our model! We can see the adjusted R-squared values and our P-values. The lower the p-value, the more significant your independent variable will be with respect to your dependent variable. Here, we’re looking for the highest one. That’s easy to see.</p>



<figure class="wp-block-image"><a href="https://www.linkedin.com/in/annebonnerdata/" target="_blank" rel="noreferrer noopener"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/05/ce187-1mger_5ny__edc9uwgyw8ow.png?w=1080&#038;ssl=1" alt=""/></a></figure>
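
<p class="wp-block-paragraph">If you’d rather not hunt through the table by eye, the fitted results object also exposes the P-values directly through its <code>pvalues</code> attribute. A sketch on randomly generated data (made up so the example can run on its own):</p>

<pre class="wp-block-preformatted">import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
X = sm.add_constant(rng.normal(size=(50, 3)))
# Only columns 1 and 2 influence y; column 3 is pure noise.
y = 10 + 4 * X[:, 1] - 3 * X[:, 2] + rng.normal(size=50)

regressor_OLS = sm.OLS(endog=y, exog=X).fit()

# One P-value per column of exog, in the same order as the columns.
p_values = regressor_OLS.pvalues
worst = int(np.argmax(p_values))
print(p_values.round(4))
print("Highest P-value is at index", worst)</pre>

<p class="wp-block-paragraph">The index it reports is a position in the matrix you passed as exog.</p>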



<p class="wp-block-paragraph">Now let’s remove it!</p>



<p class="wp-block-paragraph">We can copy and paste our code from above and remove index 2. That will look like this:</p>



<pre class="wp-block-preformatted">X_opt = X[:, [0, 1, 3, 4, 5]]<br>regressor_OLS = sm.OLS(endog = y, exog = X_opt).fit()<br>regressor_OLS.summary()</pre>



<p class="wp-block-paragraph">Just keep going until you don’t have any P-values that are higher than the SL value you chose. Remember that you always want to look at the original matrix in order to choose the correct index! You’re using the columns in your original matrix (X), not in X_opt.</p>
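
<p class="wp-block-paragraph">Once you’ve done a few rounds by hand, you could let a loop do the repeating for you. This is only a sketch: the function name <code>backward_elimination</code> is made up, the data is randomly generated, and column 0 (the constant) is treated like any other column, which you may not want in a real project.</p>

<pre class="wp-block-preformatted">import numpy as np
import statsmodels.api as sm

def backward_elimination(X, y, sl=0.05):
    """Return the indexes of the columns of X that survive backward elimination."""
    kept = list(range(X.shape[1]))
    while kept:
        # Refit from scratch on the columns that are still in the running.
        results = sm.OLS(endog=y, exog=X[:, kept]).fit()
        worst = int(np.argmax(results.pvalues))
        if results.pvalues[worst] > sl:
            del kept[worst]  # remove the weakest predictor, then refit
        else:
            break  # even the highest P-value is within the significance level
    return kept

# Made-up data: a constant plus five columns, of which only 1 and 3 matter.
rng = np.random.default_rng(0)
X = sm.add_constant(rng.normal(size=(200, 5)))
y = 10 + 4 * X[:, 1] - 3 * X[:, 3] + rng.normal(size=200)

kept = backward_elimination(X, y)
print(kept)</pre>

<p class="wp-block-paragraph">It returns indexes into the original X, so there’s no need to translate positions back from X_opt.</p>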



<p class="wp-block-paragraph">You might get to the point where you have a P-value that’s incredibly close to the SL value that you chose. For example, we chose 0.050 and here’s 0.060.</p>



<figure class="wp-block-image"><a href="https://www.linkedin.com/in/annebonnerdata/" target="_blank" rel="noreferrer noopener"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/05/4bef2-1nafku5h-rvyeevyblosuqg.png?w=1080&#038;ssl=1" alt=""/></a><figcaption>GIF via&nbsp;<a href="https://giphy.com/gifs/television-rupauls-drag-race-glitter-8zcg4BB5NKQ5W" rel="noreferrer noopener" target="_blank">GIPHY</a></figcaption></figure>



<p class="wp-block-paragraph">That’s a tough situation because the value that you chose could have been anything. If you want to follow your framework to the letter, you’ll need to remove that index. But there are other metrics that can help us decide whether we really want to do that. We could add a goodness-of-fit criterion, like the Akaike criterion mentioned earlier. There’s also a lot of information right in the summary here, like the adjusted R-squared value, that can help us make our decision.</p>



<figure class="wp-block-image"><a href="https://www.linkedin.com/in/annebonnerdata/" target="_blank" rel="noreferrer noopener"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/05/211e1-1dt0dhqsgg4nymzlt2i5sba.png?w=1080&#038;ssl=1" alt=""/></a></figure>



<p class="wp-block-paragraph">So let’s say we ran backward elimination until the end and we’re left with only the index for the R&amp;D spending column.</p>



<pre class="wp-block-preformatted">X_opt = X[:, [0, 1, 3, 4, 5]]<br>regressor_OLS = sm.OLS(endog = y, exog = X_opt).fit()<br>regressor_OLS.summary()<br>X_opt = X[:, [0, 1, 3, 5]]<br>regressor_OLS = sm.OLS(endog = y, exog = X_opt).fit()<br>regressor_OLS.summary()<br>X_opt = X[:, [0, 3, 5]]<br>regressor_OLS = sm.OLS(endog = y, exog = X_opt).fit()<br>regressor_OLS.summary()<br>X_opt = X[:, [0, 3]]<br>regressor_OLS = sm.OLS(endog = y, exog = X_opt).fit()<br>regressor_OLS.summary()</pre>



<p class="wp-block-paragraph">If we’ve been following our model carefully, that means that we now know that R&amp;D spending is a powerful predictor for our dependent variable! The conclusion here is that the data that can predict profits with the highest impact is composed of only one category: R&amp;D spending!</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/05/d3ad3-1i_6nchm-0uu4jlsobpjhua.png?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">You did it! You used multiple linear regression and backward elimination! You figured out that looking at R&amp;D spending will give you the best sense of what a company’s profits will be!</p>



<p class="wp-block-paragraph">You’re amazing!</p>



<p class="wp-block-paragraph">As always, if you’re doing anything cool with this information, let people know about it in the responses below or reach out any time on LinkedIn&nbsp;<a rel="noreferrer noopener" href="https://www.linkedin.com/in/annebonnerdata/" target="_blank">@annebonnerdata</a>!</p>



<span class="et_bloom_bottom_trigger"></span><p>The post <a href="https://contentsimplicity.com/machine-learning-multiple-linear-regression/">Multiple Linear Regression in 4 Lines of Code!</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://contentsimplicity.com/machine-learning-multiple-linear-regression/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">422</post-id>	</item>
		<item>
		<title>Simple linear regression in four lines of code</title>
		<link>https://contentsimplicity.com/machine-learning-simple-linear-regression/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=machine-learning-simple-linear-regression&#038;utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=machine-learning-simple-linear-regression</link>
					<comments>https://contentsimplicity.com/machine-learning-simple-linear-regression/#respond</comments>
		
		<dc:creator><![CDATA[Anne B]]></dc:creator>
		<pubDate>Mon, 27 May 2019 21:14:31 +0000</pubDate>
				<category><![CDATA[data]]></category>
		<category><![CDATA[tutorial]]></category>
		<category><![CDATA[artificial intelligence]]></category>
		<category><![CDATA[Coding]]></category>
		<category><![CDATA[linear regression]]></category>
		<category><![CDATA[machine learning]]></category>
		<category><![CDATA[numpy]]></category>
		<category><![CDATA[pandas]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[scikit-learn]]></category>
		<category><![CDATA[statistics]]></category>
		<guid isPermaLink="false">http://contentsimplicity.com/?p=344</guid>

					<description><![CDATA[<p>Even you can build a machine learning model. (Yes, you!) Good data alone doesn’t always tell the whole story.</p>
<p>The post <a href="https://contentsimplicity.com/machine-learning-simple-linear-regression/">Simple linear regression in four lines of code</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<h4 class="wp-block-heading">A clear and comprehensive blueprint for absolutely anyone who wants to build a simple machine learning model</h4>



<p class="wp-block-paragraph">(<em>This article first appeared in <a href="https://towardsdatascience.com/simple-linear-regression-in-four-lines-of-code-d690fe4dba84">Towards Data Science</a></em>)</p>



<p class="wp-block-paragraph">Even you can build a machine learning model.</p>



<p class="wp-block-paragraph">Seriously!</p>



<p class="wp-block-paragraph">Good data alone doesn’t always tell the whole story. Are you trying to figure out what someone’s salary should be based on their years of experience? Do you need to examine how much you’re spending on advertising in relation to your yearly sales?&nbsp;Linear regression&nbsp;might be exactly what you need!</p>



<h3 class="wp-block-heading" id="d2b0">What is linear regression?</h3>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"><p>Linear regression looks at the relationship between the data you have and the data you want to&nbsp;predict.</p></blockquote>



<p class="wp-block-paragraph">Linear regression is a basic and commonly used type of predictive analysis. It’s one of the most widely used statistical techniques.&nbsp;It quantifies the relationship between one or more&nbsp;<strong>predictor variables</strong>&nbsp;and one&nbsp;<strong>outcome variable</strong>.</p>



<p class="wp-block-paragraph"><strong>Linear regression models</strong>&nbsp;are used to show (or predict) the relationship between two variables or factors.&nbsp;<strong>Regression analysis</strong>&nbsp;is commonly used to show the&nbsp;correlation&nbsp;between two variables.</p>



<p class="wp-block-paragraph">You could, for example, look at some information about players on a baseball team and predict how well they might do that season. You might want to examine some variables about a company and predict how well their stock might do. You might even just want to examine the number of hours people study and how well they do on a test, or you could look at students’ homework grades overall in relation to how well they might do on their tests. It’s a seriously useful technique!</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/05/a964d-1spnz1ctvozf5ex4fdv6icq.jpeg?w=1080&#038;ssl=1" alt=""/><figcaption>Photo by StockSnap via&nbsp;<a href="https://pixabay.com/photos/baseball-bat-athlete-sports-2617310/" rel="noreferrer noopener" target="_blank">Pixabay</a></figcaption></figure>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"><p>Just remember:&nbsp;<strong>correlation is not causation</strong>!&nbsp;Just because a relationship exists between two variables doesn’t mean that one variable caused the other variable! Regression analysis is not used to predict cause-and-effect relationships. It can look at how variables relate to each other. It can examine to what extent variables are associated with each other.&nbsp;It’s up to you to take a closer look at those relationships.</p></blockquote>



<h3 class="wp-block-heading" id="7bdc">A couple of important terms:</h3>



<p class="wp-block-paragraph">The variable that the equation in your linear regression model is predicting is called the&nbsp;<strong>dependent variable</strong>.&nbsp;We call that one&nbsp;<strong>y</strong>. The variables that are being used to predict the dependent variable are called the&nbsp;<strong>independent variables.</strong>&nbsp;We call them&nbsp;<strong>X</strong>.</p>



<p class="wp-block-paragraph">You can think of it as though the prediction (<strong>y</strong>) is dependent on the other variables (<strong>X</strong>). That makes&nbsp;<strong>y</strong>&nbsp;the dependent variable!</p>



<p class="wp-block-paragraph">In&nbsp;<strong>simple linear regression analysis</strong>, each observation consists of two variables. These are the independent variable and the dependent variable.&nbsp;<strong>Multiple regression analysis</strong>&nbsp;looks at two or more independent variables and how they relate to the dependent variable.&nbsp;The equation that describes how&nbsp;<strong>y</strong>&nbsp;is related to&nbsp;<strong>X</strong>&nbsp;is called the&nbsp;<strong>regression model</strong>!</p>



<p class="wp-block-paragraph">Regression was first studied in depth by&nbsp;<a href="https://en.wikipedia.org/wiki/Francis_Galton" rel="noreferrer noopener" target="_blank">Sir Francis Galton</a>, a man with a wide variety of interests. While he was a very problematic character with a lot of beliefs worth disagreeing with, he did write some books with cool information about things like treating&nbsp;spear&nbsp;wounds and getting your horse unstuck from quicksand. He also did some useful work with fingerprints, hearing tests, and even devised the first weather map. He was&nbsp;knighted&nbsp;in 1909.</p>



<p class="wp-block-paragraph">While studying data on the relative sizes between parents and their children in plants and animals, he&nbsp;observed that larger-than-average parents have larger-than-average children, but those children will be less large in terms of their relative position within their own generation. He called it&nbsp;<strong>regression towards mediocrity.</strong> That would be&nbsp;<strong>regression to the mean</strong>&nbsp;in modern terms.</p>



<p class="wp-block-paragraph">(I have to say, though, that there is a certain sparkle to the phrase, “regression towards mediocrity” that I need to work into my day-to-day life&#8230;)</p>



<p class="wp-block-paragraph">To be clear, though, we’re talking about&nbsp;<strong>expectations</strong>&nbsp;(predictions) and not absolute certainty!</p>



<h3 class="wp-block-heading" id="9933">What good are regression models?</h3>



<p class="wp-block-paragraph">Regression models are used for predicting a real value, for example, salary or height.&nbsp;If your independent variable is&nbsp;<strong>time</strong>, then you are forecasting future values.&nbsp;Otherwise, your model is predicting present but unknown values. Examples of regression techniques include:</p>



<ul class="wp-block-list"><li>Simple regression</li><li>Multiple regression</li><li>Polynomial regression</li><li>Support Vector Regression</li></ul>



<p class="wp-block-paragraph">Let’s say you’re looking at some data that includes employees’ years of experience and salaries. You want to look at the correlation between those two figures. Maybe you’re running a new business or small company that has been kind of setting the numbers randomly.</p>



<p class="wp-block-paragraph">So how can you find the correlation between those two variables? In order to figure that out, we’ll create a model that will tell us what is the best fitting line for this relationship.</p>



<h4 class="wp-block-heading" id="5acd">Intuition</h4>



<p class="wp-block-paragraph">Here’s a simple linear regression formula:</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/05/18d7e-1eieyrsqib85cpa32zapqwq.png?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">(You might recognize this as the equation of a straight line, or trend line, from high school algebra.)</p>



<p class="wp-block-paragraph">In this equation,&nbsp;<strong>y</strong>&nbsp;is the dependent variable, which is what you’re trying to explain. For the rest of this article,&nbsp;<strong>y</strong>&nbsp;will be an employee’s salary after a certain number of years of experience.</p>



<p class="wp-block-paragraph">You can see the independent variable above. That’s the variable that is associated with the change in your predicted values.&nbsp;The independent variable might be causing the change or simply associated with the change. Remember,&nbsp;<strong>linear regression doesn’t prove&nbsp;</strong><strong>causation</strong>!</p>



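<p class="wp-block-paragraph">The coefficient tells you how much y changes when your independent variable changes by one unit. A one-unit change in your independent variable usually isn’t matched by an equal change in y, and the coefficient is how you account for that.</p>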



<p class="wp-block-paragraph">Now we want to look at the evidence. We want to put a line through our data that best fits our data. A regression line can show a positive linear relationship (the line looks like it’s sloping up), a negative linear relationship (the line is sloping down), or really no relationship at all (a flat line).</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/05/7775b-1uc_kzaudsalw8mq41oi9hg.png?w=1080&#038;ssl=1" alt=""/></figure>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/05/e7f89-12hdi9onqyfvksqjswn8meq.png?w=1080&#038;ssl=1" alt=""/></figure>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/05/fdc3a-13crqiliib2ki8s3uzoor_a.png?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">The constant is the point where the line crosses the vertical axis. For example, if you looked at 0 years of experience in the graph below, your salary would be around $30,000. So the constant in the chart below would be about $30,000.</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/05/70aaa-1n0ess5sefzxmzhau6t2uww.png?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">The&nbsp;steeper&nbsp;the slope, the more money you get for your years of experience. For example, maybe with 1 more year of experience, your salary (y) goes up an additional $10,000, but with a steeper slope, you might wind up with more like $15,000. With a negative slope, you’d actually lose money as you gained experience, but I really hope you won’t be working for that company for long&#8230;</p>
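
<p class="wp-block-paragraph">To make the constant and the slope concrete, here’s a tiny sketch that plugs the numbers from the paragraphs above into the line equation. The function name predict_salary is hypothetical, and the $30,000 constant and the $10,000 and $15,000 slopes are just the example figures from the text, not values fitted to real data.</p>

```python
def predict_salary(years_of_experience, constant=30_000, slope=10_000):
    """Simple linear regression formula: y = constant + slope * x."""
    return constant + slope * years_of_experience


# At 0 years of experience, the prediction is just the constant.
print(predict_salary(0))

# Each extra year adds one slope's worth of salary.
print(predict_salary(3) - predict_salary(2))

# A steeper slope means a bigger raise for the same extra year.
print(predict_salary(3, slope=15_000) - predict_salary(2, slope=15_000))
```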



<h4 class="wp-block-heading" id="690c">How does simple linear regression find that&nbsp;line?</h4>



<p class="wp-block-paragraph">When we look at a graph, we can draw vertical lines from the line to our actual observations.&nbsp;You can see the actual observations as the dots, while the line displays the model observations (the predictions).</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/05/6e761-1uc_kzaudsalw8mq41oi9hg.png?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">Each vertical line that we drew is the difference between what an employee is actually earning and what they’re modeled (predicted) to be earning.&nbsp;To find the best line, we look for the&nbsp;<strong>minimum sum of squares</strong>, which just means that you square each of those differences, add them all up, and pick the line that makes that sum as small as possible.</p>



<p class="wp-block-paragraph">That’s called the&nbsp;<strong>ordinary least squares</strong>&nbsp;method!</p>
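
<p class="wp-block-paragraph">If you’re curious what that looks like without a library, here’s a minimal sketch in plain Python. For a single independent variable, the line with the smallest sum of squared differences has a well-known closed form: the slope is the sum of (x - mean x) * (y - mean y) divided by the sum of (x - mean x) squared, and the constant is mean y minus the slope times mean x. The function names and the five data points are hypothetical.</p>

```python
def fit_line(xs, ys):
    """Return (constant, slope) of the ordinary least squares line."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    constant = mean_y - slope * mean_x
    return constant, slope


def sum_of_squares(xs, ys, constant, slope):
    """Sum of squared differences between actual and predicted values."""
    return sum((y - (constant + slope * x)) ** 2 for x, y in zip(xs, ys))


# Made-up observations: years of experience and salary (in thousands).
years = [1, 2, 3, 4, 5]
salary = [40, 48, 62, 69, 81]

constant, slope = fit_line(years, salary)
best = sum_of_squares(years, salary, constant, slope)
nudged = sum_of_squares(years, salary, constant, slope + 1)
print(constant, slope)
print(best, nudged)  # nudging the slope away from the fit makes the sum bigger
```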



<h3 class="wp-block-heading" id="dd4e">So how do we do&nbsp;that?</h3>



<p class="wp-block-paragraph">First the imports!</p>



<pre class="wp-block-preformatted">import numpy as np<br>import matplotlib.pyplot as plt<br>import pandas as pd</pre>



<p class="wp-block-paragraph">Now let’s preprocess our data! If you don’t know much about data cleaning and preprocessing, you might want to check out&nbsp;<a rel="noreferrer noopener" href="https://towardsdatascience.com/the-complete-beginners-guide-to-data-cleaning-and-preprocessing-2070b7d4c6d" target="_blank">this article</a>. It will walk you through importing libraries, preparing your data, and feature scaling. </p>



<p class="wp-block-paragraph">We’re going to copy and paste the code from that article and make two tiny changes. We’ll need to change the name of our dataset, of course. Then we’ll take a look at the data. For our example, let’s say for our employees we have one column of years of experience and one column of salaries and that’s it. We’ll take every column except the last one as our independent variable, just like we already have set up. This time, however, the dependent variable sits in the second column, so keeping in mind that our index starts at 0, we’ll make a minor change to grab the column at index 1.</p>



<pre class="wp-block-preformatted">dataset = pd.read_csv('salary.csv')<br>X = dataset.iloc[:, :-1].values<br>y = dataset.iloc[:, 1].values</pre>



<p class="wp-block-paragraph">Now X is a matrix of features (our independent variable) and y is a vector of the dependent variable. Perfect!</p>
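
<p class="wp-block-paragraph">If you want to see the matrix versus vector difference for yourself, here’s a small sketch. It assumes a two-column file shaped like the one described above, but it reads three made-up rows from a string instead of salary.csv so that it runs on its own. The column names are hypothetical.</p>

```python
import io

import pandas as pd

# Three made-up rows standing in for the real file.
csv_text = """YearsExperience,Salary
1.0,40000
2.5,55000
4.0,70000
"""

dataset = pd.read_csv(io.StringIO(csv_text))
X = dataset.iloc[:, :-1].values  # every column except the last one
y = dataset.iloc[:, 1].values    # only the column at index 1

print(X.shape)  # two dimensions: rows and columns
print(y.shape)  # one dimension: just rows
```

<p class="wp-block-paragraph">Slicing with :-1 keeps X two-dimensional even though there is only one feature, which is the shape that Scikit-learn expects for the matrix of features.</p>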



<p class="wp-block-paragraph">It’s time to split our data into a training set and a test set. Normally, we would do an 80/20 split for our training and testing data. Here, though, we’re working with a small dataset of only 30 observations. Maybe this time we’ll split up our data so that we have 20 training observations and a test size of 10.</p>



<pre class="wp-block-preformatted">from sklearn.model_selection import train_test_split<br>X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 1/3, random_state = 0)</pre>



<p class="wp-block-paragraph">You have an X_train, X_test, y_train, and y_test! You’re ready to go! (Never forget that there are about a million things to learn about, change, and improve at every step of this process. The power of your model depends on you and everything that you put into it!)</p>



<figure class="wp-block-image"><img decoding="async" src="https://cdn-images-1.medium.com/max/1600/0*OWGSTULrKNbDvT9P" alt=""/><figcaption>Photo by&nbsp;<a href="https://unsplash.com/@thomasw?utm_source=medium&amp;utm_medium=referral" rel="noreferrer noopener" target="_blank">Thomas William</a>&nbsp;on&nbsp;<a href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" rel="noreferrer noopener" target="_blank">Unsplash</a></figcaption></figure>



<p class="wp-block-paragraph">We set a random state of 0 so that we can all get the same result. (There can be random factors in calculations, and I want to make sure we’re all on the same page so that nobody gets nervous.)</p>
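
<p class="wp-block-paragraph">Here’s a quick sketch of what the random state buys you. It uses 30 made-up observations rather than the real salary data, and the variable names ending in _again are hypothetical. Splitting twice with the same random_state should give you the same rows both times.</p>

```python
import numpy as np
from sklearn.model_selection import train_test_split

# 30 made-up observations with a single feature.
X = np.arange(30).reshape(-1, 1)
y = 30_000 + 10_000 * X[:, 0]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=1/3, random_state=0
)
X_train_again, X_test_again, y_train_again, y_test_again = train_test_split(
    X, y, test_size=1/3, random_state=0
)

print(len(X_train), len(X_test))             # sizes of the two sets
print(np.array_equal(X_test, X_test_again))  # same rows in the test set?
```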



<p class="wp-block-paragraph">We’ll train our model on the training set and then later predict the results based on our information. Our model will&nbsp;<strong>learn</strong>&nbsp;the correlations on the training set. Then we will test what it learned by having it predict values with our test set. We can compare our results with the actual results on the test set to see how our model is doing!</p>



<p class="wp-block-paragraph"><strong>Always split your data into training and testing sets</strong>!&nbsp;If you test your results on the same data you used to train your model, you’ll probably have really great results, but your model isn’t good! It just memorized what you wanted it to do, rather than learning anything that it can use with unknown data. That’s called overfitting, and it means that you&nbsp;<strong>did not build a good model</strong>!</p>



<h3 class="wp-block-heading" id="e741">Feature scaling</h3>



<p class="wp-block-paragraph">We actually don’t need to do any feature scaling here!</p>



<figure class="wp-block-image"><img decoding="async" src="https://cdn-images-1.medium.com/max/1600/0*2IUiA_zJTg6VPrQz" alt=""/><figcaption>Photo by&nbsp;<a href="https://unsplash.com/@gift?utm_source=medium&amp;utm_medium=referral" rel="noreferrer noopener" target="_blank">Gift Habeshaw</a>&nbsp;on&nbsp;<a href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" rel="noreferrer noopener" target="_blank">Unsplash</a></figcaption></figure>



<h3 class="wp-block-heading" id="7287">Linear regression</h3>



<p class="wp-block-paragraph">Now we can fit the model to our training set!</p>



<p class="wp-block-paragraph">We’ll use&nbsp;<a href="https://scikit-learn.org/stable/index.html" rel="noreferrer noopener" target="_blank">Scikit-learn</a>&nbsp;for this. First, we’ll import the linear regression class from the linear model library. Then we’ll create an object of the class, the regressor. We’ll use a&nbsp;method (the fit method)&nbsp;to fit the regressor object that we create to the training set. To create the object, we name it, then call the class using parentheses. We can do all of that in about three lines of code!</p>



<p class="wp-block-paragraph">Let’s import linear regression from Scikit-Learn so that we can go ahead and use it. Between the parentheses of the fit method, we’ll specify which data we want to use so our model knows exactly what we want to fit. We want to grab both X_train and y_train because we’re working with all of our training data.</p>



<p class="wp-block-paragraph">You can look at the&nbsp;<a href="https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html" rel="noreferrer noopener" target="_blank">documentation</a>&nbsp;if you want more details!</p>



<p class="wp-block-paragraph">Now we’re ready to create our regressor and fit it to our training data.</p>



<pre class="wp-block-preformatted">from sklearn.linear_model import LinearRegression<br>regressor = LinearRegression()<br>regressor.fit(X_train, y_train)</pre>



<p class="wp-block-paragraph">There it is! We’re using simple linear regression on our data and we’re ready to try out our predictive ability on our test set!</p>



<p class="wp-block-paragraph">This is machine learning! We created a machine, the regressor, and we had it learn the correlation between years of experience and salary on the training set.</p>



<p class="wp-block-paragraph">Now it can predict future data based on the information that it has. Our machine is ready to predict a new employee’s salary based on the number of years of experience that the employee has!</p>



<p class="wp-block-paragraph">Let’s use our regressor to predict new observations. We want to see how the machine has learned by looking at what it does with new observations.</p>



<p class="wp-block-paragraph">We’ll create a vector of predicted values. This is a vector of predictions of dependent variables that we’ll call y_pred. To do this, we can take the regressor we created and trained and use the predict method. We need to specify which predictions to make, so we want to make sure we include the test set. For our input parameter in regressor.predict, we want to specify the matrix of features of new observations, so we’ll specify X_test.</p>



<pre class="wp-block-preformatted">y_pred = regressor.predict(X_test)</pre>



<p class="wp-block-paragraph">Seriously. That takes a single line of code!</p>
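
<p class="wp-block-paragraph">One thing that can trip you up: predict expects a matrix of features, even if you only want a prediction for one person. Here’s a minimal sketch with four made-up training rows that sit exactly on the line salary = 30,000 + 10,000 * years. The names new_employee and predicted_salary are hypothetical.</p>

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up training data that lies exactly on one straight line.
X_train = np.array([[1.0], [2.0], [3.0], [4.0]])
y_train = np.array([40_000.0, 50_000.0, 60_000.0, 70_000.0])

regressor = LinearRegression()
regressor.fit(X_train, y_train)

# One new employee with 5 years of experience.
# The double brackets make this a matrix with one row and one column.
new_employee = np.array([[5.0]])
predicted_salary = regressor.predict(new_employee)
print(predicted_salary)
```

<p class="wp-block-paragraph">The result comes back as a vector with one prediction per row that you passed in, so a single new employee gives you a vector of length one.</p>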



<p class="wp-block-paragraph">Now y_test holds the real salaries of the 10 observations in the test set and y_pred holds the salaries that our model predicted for those 10 employees.</p>
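
<p class="wp-block-paragraph">If you’d like a single number for how far off the predictions are, one simple option is the mean absolute error: the average gap between each real salary and its prediction. This is only a sketch. The helper is written by hand for illustration, and the three pairs of salaries are made up rather than taken from the model above.</p>

```python
def mean_absolute_error(actual, predicted):
    """Average size of the gap between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up stand-ins for y_test and y_pred.
y_test = [37_000, 122_000, 57_000]
y_pred = [40_000, 120_000, 60_000]

for actual, predicted in zip(y_test, y_pred):
    print(actual, predicted, actual - predicted)

print(mean_absolute_error(y_test, y_pred))
```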



<p class="wp-block-paragraph">You did it! Linear regression in four lines of code!</p>



<h3 class="wp-block-heading" id="1e48">Visualization</h3>



<p class="wp-block-paragraph">Let’s visualize the results! We need to see what the difference is between our predictions and the actual results.</p>



<p class="wp-block-paragraph">We can plot the graphs in order to interpret the result. First, we can plot the real observations using plt.scatter to make a scatter plot. (We imported matplotlib.pyplot earlier as plt).</p>



<p class="wp-block-paragraph">We’ll look at the training set first, so we’ll plot X_train on the X coordinates and y_train on y coordinates. Then we probably want some color. We’ll do our observations in blue, and our regression line (predictions) in red. For the regression line we’ll use X_train again for the X coordinates, and then the predictions of the X_train observations.</p>



<p class="wp-block-paragraph">Let’s also fancy it up a little with a title and labels for the x-axis and y-axis.</p>



<pre class="wp-block-preformatted">plt.scatter(X_train, y_train, color = 'blue')<br>plt.plot(X_train, regressor.predict(X_train), color = 'red')<br>plt.title('Salary vs Experience (Training set)')<br>plt.xlabel('Years of Experience')<br>plt.ylabel('Salary')<br>plt.show()</pre>



<p class="wp-block-paragraph">Now we can see our blue points, which are our real values, and our predicted values along the red line!</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/05/265c4-1kyfuuplv9vbzibebiuemgw.png?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">Let’s do the same for the test set! We’ll change the title to say “Test set” and change “train” to “test” in the scatter plot line of the code.</p>



<pre class="wp-block-preformatted">plt.scatter(X_test, y_test, color = 'blue')<br>plt.plot(X_train, regressor.predict(X_train), color = 'red')<br>plt.title('Salary vs Experience (Test set)')<br>plt.xlabel('Years of Experience')<br>plt.ylabel('Salary')<br>plt.show()</pre>



<p class="wp-block-paragraph">Make sure you notice that we aren’t changing X_train to X_test in the second line. Our regressor is already trained by the training set. When we trained, we obtained one unique model equation. If we replaced X_train with X_test there, we’d obtain the same regression line. We’d just be plotting new points along it.</p>
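
<p class="wp-block-paragraph">You can check that claim with a few lines of code. This sketch trains on made-up data (not the salary file), and the helper on_line is hypothetical. It rebuilds the line from the fitted constant and slope and confirms that predictions for both the training inputs and the test inputs land on that one line.</p>

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up data: 20 training rows and 10 test rows.
rng = np.random.default_rng(0)
X_train = rng.uniform(1, 10, size=(20, 1))
y_train = 30_000 + 10_000 * X_train[:, 0] + rng.normal(scale=5_000, size=20)
X_test = rng.uniform(1, 10, size=(10, 1))

regressor = LinearRegression().fit(X_train, y_train)


def on_line(X):
    """Points on the single fitted line: constant + slope * x."""
    return regressor.intercept_ + regressor.coef_[0] * X[:, 0]


# Both checks compare predictions against the same fitted equation.
print(np.allclose(regressor.predict(X_train), on_line(X_train)))
print(np.allclose(regressor.predict(X_test), on_line(X_test)))
```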



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/05/70aaa-1n0ess5sefzxmzhau6t2uww.png?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">This is a pretty good model!</p>



<p class="wp-block-paragraph">Our model is doing a nice job of predicting these new employee salaries. Some of the actual observations are the same as the predictions, which is great. There isn’t a 100% dependency between the&nbsp;<strong>y&nbsp;</strong>and&nbsp;<strong>X</strong>&nbsp;variables, so some of the predictions won’t be completely accurate.</p>



<p class="wp-block-paragraph">You did it! You imported libraries, cleaned and preprocessed data, built and trained a simple linear regressor, used it to make predictions, and you even visualized the results!</p>



<p class="wp-block-paragraph">Congratulations!!!</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/05/e0c4a-1f1dc-3ebwt12ebccd6zorw.jpeg?w=1080&#038;ssl=1" alt=""/><figcaption>Photo by Free-Photos via&nbsp;<a href="https://pixabay.com/photos/girls-sparklers-fireworks-984154/" rel="noreferrer noopener" target="_blank">Pixabay</a></figcaption></figure>



<h4 class="wp-block-heading" id="0be0">Want more?</h4>



<p class="wp-block-paragraph"><a rel="noreferrer noopener" href="https://towardsdatascience.com/multiple-linear-regression-in-four-lines-of-code-b8ba26192e84" target="_blank">Multiple linear regression</a>&nbsp;is up next! </p>



<h4 class="wp-block-heading" id="6553">Keep learning!</h4>



<p class="wp-block-paragraph">Machine learning is built on statistics and you can’t begin to understand machine learning without concepts like the simple linear regressor.&nbsp;But that doesn’t mean that statistics and machine learning are the same things! A linear regressor is very much a tool of statistics (and data science), in addition to being a part of the basic building blocks of machine learning.</p>



<p class="wp-block-paragraph">As always, if you’re doing anything cool with this information, let people know about it in the comments below or reach out any time on LinkedIn&nbsp;<a rel="noreferrer noopener" href="https://www.linkedin.com/in/annebonnerdata/" target="_blank">@annebonnerdata</a>!</p>



<span class="et_bloom_bottom_trigger"></span><p>The post <a href="https://contentsimplicity.com/machine-learning-simple-linear-regression/">Simple linear regression in four lines of code</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://contentsimplicity.com/machine-learning-simple-linear-regression/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">344</post-id>	</item>
		<item>
		<title>How to Create a Totally Free Portfolio or Website</title>
		<link>https://contentsimplicity.com/how-to-create-a-free-portfolio/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=how-to-create-a-free-portfolio&#038;utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=how-to-create-a-free-portfolio</link>
					<comments>https://contentsimplicity.com/how-to-create-a-free-portfolio/#respond</comments>
		
		<dc:creator><![CDATA[Anne B]]></dc:creator>
		<pubDate>Sun, 26 May 2019 21:03:29 +0000</pubDate>
				<category><![CDATA[github]]></category>
		<category><![CDATA[tutorial]]></category>
		<category><![CDATA[blog]]></category>
		<category><![CDATA[Coding]]></category>
		<category><![CDATA[free]]></category>
		<category><![CDATA[GitHub Pages]]></category>
		<category><![CDATA[portfolio]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[webdev]]></category>
		<category><![CDATA[website]]></category>
		<guid isPermaLink="false">http://contentsimplicity.com/?p=319</guid>

					<description><![CDATA[<p>Do you want to create a totally free portfolio, blog, or website but you don’t know where to start? It's simple and painless if you know what to do!</p>
<p>The post <a href="https://contentsimplicity.com/how-to-create-a-free-portfolio/">How to Create a Totally Free Portfolio or Website</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<h4 class="wp-block-heading" id="3d10">Getting started with GitHub Pages: the unbelievably quick and easy guide for creating and publishing a free portfolio, blog, or&nbsp;website</h4>



<p class="wp-block-paragraph"><em>(This article first appeared on <a rel="noreferrer noopener" aria-label="Towards Data Science (opens in a new tab)" href="https://towardsdatascience.com/how-to-create-a-free-github-pages-website-53743d7524e1" target="_blank">Towards Data Science</a>)</em></p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"><p>GitHub Pages has to be the coolest tool that people don’t know that they already have. Pretty much any repository on GitHub can be&nbsp;turned into&nbsp;a website with the click of two&nbsp;buttons. It&#8217;s the simplest way to build and host a totally free portfolio, website, or blog.</p></blockquote>



<p class="wp-block-paragraph">Do you need an online portfolio of your work for potential employers to&nbsp;check out, but you don’t know how to make a website? Do you want to create a free portfolio, blog, or business site, but you don’t know where to start? Is it possible that you just don’t want to deal with (or pay for) website hosting, domain names, and everything else?</p>



<p class="wp-block-paragraph">This one’s for you!</p>



<figure class="wp-block-image"><img decoding="async" src="https://cdn-images-1.medium.com/max/1600/0*0PnYdtyWLAHf1xtu" alt=""/><figcaption>Photo by&nbsp;<a href="https://unsplash.com/@naptimedoe?utm_source=medium&amp;utm_medium=referral" rel="noreferrer noopener" target="_blank">Leonard Alcira</a>&nbsp;on&nbsp;<a href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" rel="noreferrer noopener" target="_blank">Unsplash</a></figcaption></figure>



<h4 class="wp-block-heading" id="d200">Why should I have a&nbsp;website?</h4>



<p class="wp-block-paragraph">It’s hard to imagine anyone who wouldn’t benefit from having a website! You might need to display your portfolio for potential clients or employers. You might need to organize your projects in a way that you can share. You may want to create a blog about the things you’re doing or the places you’ve been. You might need to advertise&nbsp;yourself or your business or sell a product. Whatever your reason, there’s a good chance that you want to put something together without spending a ton of time on it. There’s an even better chance that you don’t want&nbsp;to spend a lot of money.</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/05/ff2e7-1qv6eo6gfdgoemc54ugspig.jpeg?w=1080&#038;ssl=1" alt=""/><figcaption>Photo by imarksm via&nbsp;<a href="http://pixabay.com/" rel="noreferrer noopener" target="_blank">Pixabay</a></figcaption></figure>



<p class="wp-block-paragraph">A website is a way to connect with the world and it’s a powerful tool for communication. It’s a way to share your work, your interests, and your passions. It’s how you can create, build, and control your online image. Plus, the sooner you build your site, the more time you’ll have to build your online presence and reach the people that you want to reach. It can help you stand out in a sea of competitors.</p>



<p class="wp-block-paragraph">It’s also not the easiest thing to create if you don’t know what you’re doing!</p>



<p class="wp-block-paragraph">There are a few ways that a beginner can create a simple and completely free portfolio or website. The main ones are GitHub and WordPress.</p>



<p class="wp-block-paragraph">WordPress is great for beginners who need&nbsp;<em>all</em>&nbsp;the help. I love WordPress! That’s how I got started with my very first blog! The thing about a free WordPress site is that it’s really obvious that it’s a WordPress site. You have an address that ends in wordpress.com and a WordPress logo at the bottom of every page.</p>



<p class="wp-block-paragraph">If you’re getting started in the tech world, you’re going to find that you look more appealing if you know how to use GitHub. If you’ve been in the tech world for a little while, there’s a very good chance that you already have somewhere between one and one million repositories on GitHub.</p>



<p class="wp-block-paragraph">Why not build your website on GitHub and host it right from your repository?</p>



<p class="wp-block-paragraph">So much of what matters in the tech world right now is contributing to open source. Sharing your work openly in the community is a big deal. GitHub is designed for exactly this. Putting your work on GitHub shows that you’re involved and aware. (<a href="https://bonn0062.github.io/anne_bonner/" rel="noreferrer noopener" target="_blank">I host my portfolio right from a repository on GitHub if you want to take a look</a>. It’s pretty out-of-date, but it is an example of a profile site built with Bootstrap and hosted from a GitHub repo.)</p>



<p class="wp-block-paragraph">When you share your projects on GitHub, people can see your code, what you’re doing, and how you’re doing it. GitHub is all about the communication of ideas.</p>



<p class="wp-block-paragraph">Pretty&nbsp;much everyone in tech right now is using Git and/or GitHub in some way. Having your profile right there on&nbsp;GitHub&nbsp;is a great way to hold up your hand and get involved. Plus, you’ll wind up with a&nbsp;repository&nbsp;and some commits on your profile page!</p>



<p class="wp-block-paragraph">If you’re brand new to everything Git, you might want to check out “<a href="https://towardsdatascience.com/getting-started-with-git-and-github-6fcd0f2d4ac6" rel="noreferrer noopener" target="_blank">Getting Started with Git and GitHub: the complete beginner’s guide</a>.” That article will walk you through the basics of what Git and GitHub are, concepts like “<strong>repositories</strong>,” and a ton more. I’m going to assume that you already know the basics. If you don’t, it’s worth taking a few minutes to get acquainted with them.</p>



<h4 class="wp-block-heading" id="c49e">Let’s get this party&nbsp;started!</h4>



<p class="wp-block-paragraph">There are two ways of getting started with your free portfolio or website. You might be starting completely from scratch! On the other hand, you might have a website that you’ve already put together, but you don’t know how to use GitHub to turn it into a free website.</p>



<p class="wp-block-paragraph">I’ll start with the second case, where you already have the files.</p>



<h4 class="wp-block-heading" id="e4cc">I have the files, but I don’t know what to do with&nbsp;them!</h4>



<p class="wp-block-paragraph">This couldn’t be easier. Seriously! GitHub does the rest of the work for you. I’m assuming that you already have a GitHub account and that you know what a repository is, but if you don’t, check out&nbsp;<a href="https://towardsdatascience.com/getting-started-with-git-and-github-6fcd0f2d4ac6" rel="noreferrer noopener" target="_blank">that getting started with Git and GitHub article</a>.</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"><p>In a&nbsp;nutshell, a repository is where your project will live. It’s where you’ll organize your project. You’ll keep folders, files, images, videos, spreadsheets, Jupyter notebooks, data sets, and anything else your project needs in your repository.</p></blockquote>



<p class="wp-block-paragraph">If you haven’t already, go ahead and&nbsp;<a href="https://towardsdatascience.com/getting-started-with-git-and-github-6fcd0f2d4ac6" rel="noreferrer noopener" target="_blank">initialize your project with a repository, or create a new repository and upload your files</a>. If the folder you publish from contains a file called “index.html,” GitHub Pages will use it as your site’s home page.</p>



<p class="wp-block-paragraph">Now you’re going to take advantage of&nbsp;<a href="https://pages.github.com/" rel="noreferrer noopener" target="_blank">GitHub Pages</a>. Go to your GitHub repository and click “<strong>Settings</strong>.”</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/05/f599c-1urnptt5hxovct5qwinxura.png?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">Scroll down to “<strong>GitHub Pages</strong>.” You’ll see this:</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/05/ad0f9-1124pygepuutuaruawf3kxw.png?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">Now set the “<strong>Source</strong>” dropdown menu to either “<strong>master branch</strong>” or “<strong>master branch/docs folder</strong>.” Here’s the thing: if you want to publish from your “docs” folder, your master branch needs to actually contain a “docs” folder that holds the files for your website!</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/05/76e15-10tb2ykgfgsd_re_-mm_vpa.png?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">Chances are, if you’re a beginner, you’ll choose “master branch,” which just means that you want to publish your repository pretty much as-is. (There have been a couple of times where I needed to tweak a file path or two, depending on how I had my folders structured.)</p>



<p class="wp-block-paragraph">You’re going to see a notification that your site is ready to be published.</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/05/1daee-1vng_2zieddocdg8ey2xwza.png?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">Be patient, wait a minute or two, and then refresh the page or try the link if you want. Once your site has been published, you’ll see this:</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/05/0cf03-1fpwud89eqfuwrcv7gelaxw.png?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">Try clicking on that link.</p>



<p class="wp-block-paragraph">Poof! You have a free website! This could just as easily be a free portfolio or blog!</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/05/2e664-1blsmv1wdxbi46sdpq0g6kw.png?w=1080&#038;ssl=1" alt=""/></figure>
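<p class="wp-block-paragraph">If you’re wondering where that link comes from, the address follows a predictable pattern. Here’s a minimal Python sketch of it, assuming you haven’t set up a custom domain. The function name <code>pages_url</code> is made up for illustration, and the example values come from my own portfolio link above.</p>

```python
def pages_url(username, repository):
    """Return the address GitHub Pages serves a repository from."""
    user = username.lower()
    # A repository named "USERNAME.github.io" is served from the site root.
    if repository.lower() == f"{user}.github.io":
        return f"https://{user}.github.io/"
    # Any other repository is served from a path named after the repository.
    return f"https://{user}.github.io/{repository}/"


print(pages_url("bonn0062", "anne_bonner"))
# prints https://bonn0062.github.io/anne_bonner/
```

<p class="wp-block-paragraph">The link shown in your repository settings is always the one to trust. This is just a handy way to guess the address before the page finishes publishing.</p>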



<p class="wp-block-paragraph">Congratulations!!!</p>



<p class="wp-block-paragraph">Now for the other option:</p>



<h4 class="wp-block-heading" id="1084">I don’t even know how to get&nbsp;started!</h4>



<p class="wp-block-paragraph">I’m going to tackle things like Bootstrap and basic website design another time and focus on the absolute basics here. I do want you to know, though, that the world is your oyster! The only thing limiting your options here is your drive to make it happen. (Well, maybe drive and also the amount of time you have available…) Because this option is for the complete beginner, I’m going to show you how to do everything right on the GitHub website.</p>



<p class="wp-block-paragraph">We’ll go ahead and create a new repository first.</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/05/fb2cc-14x01m3jkfxbxc8srdqlvsw.png?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">Fill in your repository name and a short description, check the box that says “<strong>Initialize this repository with a README</strong>,” and then click “<strong>Create repository</strong>.”</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/05/b0633-1mab6bniznp1kszchl-rola.png?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">Now go to “<strong>Settings</strong>” near the top right-hand side of the screen and then scroll down to the “<strong>GitHub Pages</strong>” section. Change the dropdown menu that says “<strong>None</strong>” to “<strong>master branch</strong>.”</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/05/76e15-10tb2ykgfgsd_re_-mm_vpa.png?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">First, you’ll see this:</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/05/0dc9d-1cneumzldxrdq3_pzgbgfhw.png?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">Wait a minute or two, and then you’ll see this:</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/05/151ac-16n0htgmcwencsvtiaamsfg.png?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">Now click the link!</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/05/60ab9-1skhng8qwqtdjmp_hjp0-da.png?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">You have a website! Congratulations!</p>



<h4 class="wp-block-heading" id="3c5c">That doesn’t look like&nbsp;much</h4>



<p class="wp-block-paragraph">Okay, that does look pretty boring, but you can see here that what’s displaying is your README file.</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/05/af307-1r2oawbigrt5uksticd5rjq.png?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">If you want to make some quick changes, you can go ahead and edit your README to display what you want people to see. To do that, go back into your repository, click the little pencil icon on your README file, and make it better!</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/05/ace0c-1f4ikuxmda4xgzm9f1dl-gw.png?w=1080&#038;ssl=1" alt=""/><figcaption>Edit your README&nbsp;file</figcaption></figure>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/05/659a9-1glz64-7wz5a-ix0gpf0jeq.png?w=1080&#038;ssl=1" alt=""/><figcaption>Editing the file (you’re working with a Markdown&nbsp;file)</figcaption></figure>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/05/05c78-1-p_zknal-zaaqrsw7im2jw.png?w=1080&#038;ssl=1" alt=""/><figcaption>How the file looks with a few&nbsp;edits!</figcaption></figure>



<p class="wp-block-paragraph">You’re using Markdown, and there are a lot of things you can do with Markdown files. This includes adding text, images, links, colors, and some basic formatting. It’s a seriously simple way to start! Here’s the&nbsp;<a href="https://www.markdownguide.org/basic-syntax/" rel="noreferrer noopener" target="_blank">Markdown Guide to basic syntax</a>&nbsp;for anyone who hasn’t worked with it before.</p>



<p class="wp-block-paragraph">(Remember that if you add any images to your README, you want to make sure to upload them to your repository, or GitHub won’t know what you want!)</p>
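<p class="wp-block-paragraph">If you want to double-check that before you commit, here’s a rough Python sketch that scans README text for Markdown image links and lists any local files that aren’t in your repository folder. The names <code>missing_images</code> and <code>IMAGE_PATTERN</code> are hypothetical, and the pattern only covers the common <code>![alt](path)</code> form, not HTML image tags or reference-style images.</p>

```python
import re
from pathlib import Path

# Matches Markdown images such as ![alt text](images/photo.png)
IMAGE_PATTERN = re.compile(r"!\[[^\]]*\]\(([^)\s]+)")


def missing_images(readme_text, repo_root="."):
    """Return the relative image paths in the README that are not in the repo."""
    missing = []
    for target in IMAGE_PATTERN.findall(readme_text):
        # Skip images hosted elsewhere; only local files need uploading.
        if target.startswith(("http://", "https://")):
            continue
        if not (Path(repo_root) / target).is_file():
            missing.append(target)
    return missing
```

<p class="wp-block-paragraph">You could call it with something like <code>missing_images(Path("README.md").read_text(encoding="utf-8"))</code> from inside your project folder. An empty list means every local image the README mentions is already there.</p>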



<p class="wp-block-paragraph">Now go back to your website and see what you have!</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/05/32b99-1ejuu5o_4zzgaxxdfvf3zlq.png?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">Be aware that it sometimes takes a few minutes for the changes to go through. If you don’t see your changes immediately, wait a few minutes and try again. I’ve also had an issue where my laptop wanted to keep going back to an older version of my website even though I’d made changes. Deleting my browser history for the last 24 hours fixed that problem. Try the easy fixes before you&nbsp;freak out&nbsp;about the complicated stuff!</p>



<h4 class="wp-block-heading" id="7811">That’s an improvement, but it could be more interesting</h4>



<p class="wp-block-paragraph">If you’re a total beginner and you don’t know anything about CSS, but you want a little more visual appeal, try a&nbsp;Jekyll&nbsp;theme! They’re prebuilt themes that you can use to make your site look a little better with basically no effort on your end. Jekyll and GitHub will do the work for you! Your job is to push a button or two.</p>



<p class="wp-block-paragraph">Go back to the “<strong>GitHub Pages</strong>” section in “<strong>Settings</strong>” and click on “<strong>Choose a theme</strong>.”</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/05/ad0f9-1124pygepuutuaruawf3kxw.png?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">Let’s see what our website looks like if we choose the first theme that shows up. All you have to do is press the green “<strong>Select theme</strong>” button, give it a couple of minutes, and then try your website again!</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/05/72b19-1o4-x5hr-iv5quvz5do4p8q.png?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">And with just a few minutes of effort, we’re already getting somewhere!</p>



<figure class="wp-block-image"><img data-recalc-dims="1" decoding="async" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/05/3b7c6-1etbtaywkh8r44rs00lrb_g.png?w=1080&#038;ssl=1" alt=""/></figure>



<p class="wp-block-paragraph">That’s it! In just a few minutes, you created your own free website for your business, blog, or even your very own free portfolio site, hosted it through a GitHub repository, and it’s already up and running. You’re ready to share with the world!</p>



<p class="wp-block-paragraph">Way to go!!!</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"><p>Just a couple of notes:</p></blockquote>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"><p>* If you decide that you don’t want to use a theme after all, there’s no button to go back to the original version. It’s actually totally easy to get rid of your theme, though! If you go back to your repository, you’ll discover that you now have a file called “<strong>_config.yml</strong>” which contains your theme information. If you delete that file, you delete the theme!</p></blockquote>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"><p>* If you want to play around with your theme and theme options, you’ll find that the “<strong>_config.yml</strong>” file is your first stop. Now that you know that, take a look at the zillions of other Jekyll options that you have! You can even start with the&nbsp;<a href="https://www.jekyllnow.com/" rel="noreferrer noopener" target="_blank">Jekyll Now</a>&nbsp;theme if you want a simple and already set-up blog. Your options are endless!</p></blockquote>
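<p class="wp-block-paragraph">To give you an idea of what’s inside that file: a theme picked through the theme chooser is typically stored as a single <code>theme:</code> line. Here’s a small Python sketch that reads it back out. The theme name <code>jekyll-theme-cayman</code> is only an example value, and <code>read_theme</code> is a made-up helper, not part of Jekyll or GitHub.</p>

```python
def read_theme(config_text):
    """Return the value of the top-level `theme:` key, or None if absent."""
    for line in config_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key == "theme":
            # Drop surrounding whitespace and optional quotes.
            return value.strip().strip("\"'")
    return None


# Hypothetical file contents; your theme name may differ.
example = "theme: jekyll-theme-cayman\n"
print(read_theme(example))
# prints jekyll-theme-cayman
```

<p class="wp-block-paragraph">If the function returns <code>None</code>, there’s no theme line in the file, which is the same situation you end up in after deleting “_config.yml.”</p>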



<p class="wp-block-paragraph">I can’t wait to see what you create! As always, if you make anything amazing with this information, let everyone know about it in the comments below or reach out any time on Twitter&nbsp;<a rel="noreferrer noopener" href="https://twitter.com/annebonnerdata" target="_blank">@annebonnerdata</a>. Feel free to share your free portfolios and blogs here for everyone to see!</p>



<p class="wp-block-paragraph">Thanks for reading!</p>



<p>The post <a href="https://contentsimplicity.com/how-to-create-a-free-portfolio/">How to Create a Totally Free Portfolio or Website</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://contentsimplicity.com/how-to-create-a-free-portfolio/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">319</post-id>	</item>
		<item>
		<title>Intro to Deep Learning</title>
		<link>https://contentsimplicity.com/intro-to-deep-learning/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=intro-to-deep-learning&#038;utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=intro-to-deep-learning</link>
					<comments>https://contentsimplicity.com/intro-to-deep-learning/#respond</comments>
		
		<dc:creator><![CDATA[Anne B]]></dc:creator>
		<pubDate>Fri, 12 Apr 2019 15:55:19 +0000</pubDate>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[artificial intelligence]]></category>
		<category><![CDATA[Coding]]></category>
		<category><![CDATA[deep learning]]></category>
		<category><![CDATA[machine learning]]></category>
		<category><![CDATA[Programming]]></category>
		<guid isPermaLink="false">https://learnbasictech.wordpress.com/?p=89</guid>

					<description><![CDATA[<p>We live in a world where, for better and for worse, we are constantly surrounded by deep learning algorithms. In fact, you’re probably reading this article right now because a deep learning algorithm thinks you should see it.</p>
<p>The post <a href="https://contentsimplicity.com/intro-to-deep-learning/">Intro to Deep Learning</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p>Photo by ibjennyjenny on <a href="http://pixabay.com">Pixabay</a></p>
<p>What&#8217;s going on behind a deep learning algorithm?</p>
<p><em><a href="https://towardsdatascience.com/intro-to-deep-learning-c025efd92535" target="_blank" rel="noopener noreferrer">(This article first appeared in Towards Data Science)</a></em></p>
<p>We live in a world where, for better and for worse, we are constantly surrounded by deep learning algorithms. From social network filtering to driverless cars to movie recommendations, and from financial fraud detection to drug discovery to medical image processing (<em class="markup--em markup--p-em" style="font-size: inherit;">…is that bump cancer?</em>), the field of deep learning influences our lives and our decisions every single day.</p>
<p id="85dd" class="graf graf--p graf-after--p">In fact, you’re probably reading this article right now because a deep learning algorithm thinks you should see it.</p>
<figure id="de29" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded">
<div style="width: 650px" class="wp-caption alignnone"><img data-recalc-dims="1" loading="lazy" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/840d0-62b83-1jg4f0jy3w64tc7k5e_7zlg.jpeg?resize=640%2C427" width="640" height="427" /><p class="wp-caption-text"><em>Photo by tookapic on <a class="markup--anchor markup--figure-anchor" href="https://pixabay.com/" target="_blank" rel="noopener noreferrer">Pixabay</a></em></p></div>
</div>
</div>
<figcaption class="imageCaption"></figcaption>
</figure>
<p id="a258" class="graf graf--p graf-after--figure">If you’re looking for the basics of deep learning, artificial neural networks, convolutional neural networks (neural networks in general…), backpropagation, gradient descent, and more, you’ve come to the right place. In this series of articles, I’m going to explain these concepts as simply and comprehensibly as I can.</p>
<p id="7409" class="graf graf--p graf-after--p">There will also be cats.</p>
<p id="4c5f" class="graf graf--p graf-after--p">Learning is so much easier when it’s sprinkled with a little silly.</p>
<figure id="cbe7" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded">
<div style="width: 650px" class="wp-caption alignnone"><img data-recalc-dims="1" loading="lazy" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/abf4c-89753-1dpdfhwpw1o73tv28boqekq.jpeg?resize=640%2C479" width="640" height="479" /><p class="wp-caption-text"><em>Photo by skeeze on <a class="markup--anchor markup--figure-anchor" href="http://pixabay.com/" target="_blank" rel="noopener noreferrer">Pixabay</a></em></p></div>
</div>
</div>
<figcaption class="imageCaption"></figcaption>
</figure>
<p id="f305" class="graf graf--p graf-after--figure">If you get into deep learning, there’s an incredible amount of really in-depth information out there. I’ll make sure to provide additional resources along the way for anyone who wants to swim a little deeper into these waters. (For example, you might want to check out <a class="markup--anchor markup--p-anchor" href="http://yann.lecun.com/exdb/publis/pdf/lecun-98b.pdf" target="_blank" rel="noopener noreferrer">Efficient BackProp by Yann LeCun, et al.</a>, which is written by one of the most important figures in deep learning. This paper looks specifically at backpropagation, but also discusses some of the most important topics in deep learning, like gradient descent, stochastic learning, batch learning, and so on. It’s all here if you want to take a look!)</p>
<p id="83dd" class="graf graf--p graf-after--p">For now, let’s jump right in!</p>
<figure id="4545" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded">
<div style="width: 1610px" class="wp-caption alignnone"><img loading="lazy" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://cdn-images-1.medium.com/max/1600/0*mLaXJxEBhdJcqtN6" width="1600" height="1200" /><p class="wp-caption-text"><em>Photo by <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/@laurinebailly?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-creator noopener noreferrer">Laurine Bailly</a> on <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-source noopener noreferrer">Unsplash</a></em></p></div>
</div>
</div>
<figcaption class="imageCaption"></figcaption>
</figure>
<h4 id="7598" class="graf graf--h4 graf-after--figure">What is deep learning?</h4>
<p id="8772" class="graf graf--p graf-after--h4">Really, it’s just <strong class="markup--strong markup--p-strong"><em class="markup--em markup--p-em">learning from examples</em></strong>. That’s pretty much the deal.</p>
<p id="edf4" class="graf graf--p graf-after--p">At a very basic level, deep learning is a machine learning technique that teaches a computer to filter inputs (observations in the form of images, text, or sound) through layers in order to learn how to predict and classify information.</p>
<p id="3150" class="graf graf--p graf-after--p">Deep learning algorithms are inspired by the way that the human brain filters information!</p>
<figure id="b2ff" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded">
<div style="width: 1610px" class="wp-caption alignnone"><img loading="lazy" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://cdn-images-1.medium.com/max/1600/0*OG7WtN_-633kPUFi" width="1600" height="2399" /><p class="wp-caption-text"><em>Photo by <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/@chrisjoelcampbell?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-creator noopener noreferrer">Christopher Campbell</a> on <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-source noopener noreferrer">Unsplash</a></em></p></div>
</div>
</div>
<figcaption class="imageCaption"></figcaption>
</figure>
<p id="744c" class="graf graf--p graf-after--figure">Essentially, deep learning is a part of the machine learning family that’s based on <em class="markup--em markup--p-em">learning data representations</em> (rather than task-specific algorithms). <span class="markup--quote markup--p-quote is-other">Deep learning is actually closely related to a </span><span class="markup--quote markup--p-quote is-other">class of theories about brain development proposed by cognitive neuroscientists in the early ’90s.</span><span class="markup--quote markup--p-quote is-other"> Just like in the brain (or, more accurately, in the theories and models put together by researchers in the ’90s regarding the development of the human neocortex)</span>, neural networks use a hierarchy of layered filters in which each layer learns from the previous layer and then passes its output to the next layer.</p>
<p id="00dc" class="graf graf--p graf-after--p"><span class="markup--quote markup--p-quote is-other"><strong class="markup--strong markup--p-strong"><em class="markup--em markup--p-em">Deep learning attempts to mimic the activity in layers of neurons in the neocortex.</em></strong></span></p>
<p id="e789" class="graf graf--p graf-after--p"><span class="markup--quote markup--p-quote is-other">In the human brain, there are roughly 86 billion neurons, and each neuron is connected to thousands of others. Essentially, that is what we’re trying to create, but in a way and at a level that works for machines.</span></p>
<figure id="0020" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded">
<div style="width: 650px" class="wp-caption alignnone"><img data-recalc-dims="1" loading="lazy" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/c8541-6b1c1-19fva3aodiryt6sb5huasqw.png?resize=640%2C545" width="640" height="545" /><p class="wp-caption-text"><em>Photo by GDJ on <a class="markup--anchor markup--figure-anchor" href="https://pixabay.com/" target="_blank" rel="noopener noreferrer">Pixabay</a></em></p></div>
</div>
</div>
<figcaption class="imageCaption"></figcaption>
</figure>
<p id="7d9b" class="graf graf--p graf-after--figure">The purpose of deep learning is to mimic how the human brain works in order to create some real magic.</p>
<p id="3eae" class="graf graf--p graf-after--p"><span class="markup--quote markup--p-quote is-other">What does this mean in terms of neurons, axons, dendrites, and so on? Well, the neuron has a body, dendrites, and an axon. The signal from one neuron travels down the axon and is transferred to the dendrites of the next neuron. That connection (not an actual physical connection, but a connection nonetheless) where the signal is passed is called a synapse.</span></p>
<figure id="377b" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded">
<div style="width: 492px" class="wp-caption alignnone"><img data-recalc-dims="1" loading="lazy" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/4fb07-fc247-10oxciq4kwvxpxezlwzhotq.png?resize=482%2C640" width="482" height="640" /><p class="wp-caption-text"><em>Photo by mohamed_hassan on <a class="markup--anchor markup--figure-anchor" href="https://pixabay.com/" target="_blank" rel="noopener noreferrer">Pixabay</a></em></p></div>
</div>
</div>
<figcaption class="imageCaption"></figcaption>
</figure>
<p id="da7b" class="graf graf--p graf-after--figure">Neurons by themselves are kind of useless, but when you have lots of them, they work together to create some serious magic. That’s the idea behind a deep learning algorithm! You get input from observation, you put your input into one layer that creates an output which in turn becomes the input for the next layer, and so on. This happens over and over until your final output signal!</p>
<p id="aaef" class="graf graf--p graf-after--p"><span class="markup--quote markup--p-quote is-other">So the neuron (or <strong class="markup--strong markup--p-strong">node</strong>) gets a signal or signals (<strong class="markup--strong markup--p-strong">input values</strong>), which pass through the neuron, and that delivers the <strong class="markup--strong markup--p-strong">output signal</strong></span><span class="markup--quote markup--p-quote is-other">. Think of the input layer as your senses: the things you see, smell, feel, etc. These are independent variables for one single observation. This information is broken down into numbers and the bits of binary data that a computer can use. (You will need to either standardize or normalize these variables so that they’re within the same range.)</span></p>
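<p class="graf graf--p graf-after--p">To make the “same range” idea concrete, here is a minimal sketch of both options in plain Python. The function names and the sample ages and salaries are made up for illustration; they are not from any particular library.</p>

```python
from statistics import mean, pstdev

def min_max_normalize(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Shift and scale values to mean 0 and standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Hypothetical observations: ages and salaries live on very different scales.
ages = [20, 30, 40, 50, 60]
salaries = [30000, 45000, 60000, 75000, 90000]

print(min_max_normalize(ages))
print(min_max_normalize(salaries))
print(standardize(ages))
```

<p class="graf graf--p graf-after--p">After normalizing, both columns run from 0 to 1, so neither one dominates just because its raw numbers are bigger.</p>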
<p id="6efb" class="graf graf--p graf-after--p"><span class="markup--quote markup--p-quote is-other">What can our output value be? It can be <strong class="markup--strong markup--p-strong">continuous</strong> (for example, price), <strong class="markup--strong markup--p-strong">binary</strong> (yes or no), or <strong class="markup--strong markup--p-strong">categorical</strong> (cat, dog, moose, hedgehog, sloth, etc.).</span> If it’s categorical, remember that your output won’t be just one variable, but several output variables, one per category.</p>
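<p class="graf graf--p graf-after--p">One common way to turn a category into several output variables is one-hot encoding. The sketch below assumes that approach; the <code>one_hot</code> helper and the class list are illustrative names, not part of a specific framework.</p>

```python
CLASSES = ["cat", "dog", "moose", "hedgehog", "sloth"]

def one_hot(label, classes=CLASSES):
    """Return a list with 1.0 at the label's position and 0.0 elsewhere."""
    if label not in classes:
        raise ValueError(f"unknown label: {label!r}")
    return [1.0 if c == label else 0.0 for c in classes]

# One output variable per category; only the matching one is switched on.
print(one_hot("moose"))  # [0.0, 0.0, 1.0, 0.0, 0.0]
```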
<figure id="e387" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded">
<div style="width: 1610px" class="wp-caption alignnone"><img loading="lazy" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://cdn-images-1.medium.com/max/1600/0*K73BAUskB1Iw3_MI" width="1600" height="1413" /><p class="wp-caption-text"><em>Photo by <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/@hanialistek?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-creator noopener noreferrer">Hanna Listek</a> on <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-source noopener noreferrer">Unsplash</a></em></p></div>
</div>
</div>
<figcaption class="imageCaption"></figcaption>
</figure>
<p id="142d" class="graf graf--p graf-after--figure">Also, <span class="markup--quote markup--p-quote is-other">keep in mind that <strong class="markup--strong markup--p-strong">your output value will always be related to the same single observation from the input values</strong></span>. If your input values were, for example, an observation of the age, salary, and vehicle of one person, your output value would also relate to the same observation of the same person. This sounds pretty basic, but it’s important to keep in mind.</p>
<p id="72cc" class="graf graf--p graf-after--p"><span class="markup--quote markup--p-quote is-other">What about <strong class="markup--strong markup--p-strong">synapses</strong>? Each of the synapses gets assigned weights, which are crucial to <strong class="markup--strong markup--p-strong">Artificial Neural Networks</strong> (ANNs). Weights are how ANNs learn. By adjusting the weights, the ANN decides to what extent signals get passed along. </span><span class="markup--quote markup--p-quote is-other">When you’re training your network, you’re deciding how the weights are adjusted.</span></p>
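<p class="graf graf--p graf-after--p">Here is a rough sketch of what “adjusting the weights” can look like for a single weight. It assumes plain gradient descent on a squared error, which is one common choice and not the only way to train a network; the function name and numbers are hypothetical.</p>

```python
def train_single_weight(x, target, weight=0.0, learning_rate=0.1, steps=50):
    """Repeatedly nudge one weight so that weight * x approaches target."""
    for _ in range(steps):
        prediction = weight * x
        error = prediction - target
        gradient = 2 * error * x      # derivative of error**2 with respect to weight
        weight -= learning_rate * gradient
    return weight

# With x = 2 and a target of 6, the weight should settle very close to 3.
w = train_single_weight(x=2.0, target=6.0)
print(round(w, 4))
```

<p class="graf graf--p graf-after--p">Each pass moves the weight a little in the direction that shrinks the error, which is the basic idea behind training.</p>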
<p id="df11" class="graf graf--p graf-after--p"><span class="markup--quote markup--p-quote is-other">What happens inside the <strong class="markup--strong markup--p-strong">neuron</strong>? First, each incoming value is multiplied by the weight of its synapse and the results are added up (the <strong class="markup--strong markup--p-strong">weighted sum </strong>is calculated). Next, the neuron applies an activation function to that sum. The result determines whether, and how strongly, the neuron passes a signal along.</span></p>
<p id="035b" class="graf graf--p graf-after--p">This is repeated thousands or even hundreds of thousands of times in a deep learning algorithm!</p>
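<p class="graf graf--p graf-after--p">A single neuron’s weighted sum and activation can be sketched in a few lines. The sigmoid used here is just one possible activation function, picked as an assumption for the example, and the input values and weights are invented.</p>

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, activation=sigmoid):
    """Compute the weighted sum of the inputs, then apply the activation."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return activation(weighted_sum)

# Three input values, each with its own (made-up) synapse weight.
print(neuron([0.5, 0.25, 1.0], [0.4, -0.8, 0.2]))
```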
<figure id="105b" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded">
<div style="width: 650px" class="wp-caption alignnone"><img data-recalc-dims="1" loading="lazy" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/568bb-d327c-1g6zlsges6n04fwsu5gnncw.jpeg?resize=640%2C452" width="640" height="452" /><p class="wp-caption-text">Photo by Geralt on <a class="markup--anchor markup--figure-anchor" href="https://pixabay.com/" target="_blank" rel="noopener noreferrer">Pixabay</a></p></div>
</div>
</div>
<figcaption class="imageCaption"></figcaption>
</figure>
<p id="7df9" class="graf graf--p graf-after--figure"><span class="markup--quote markup--p-quote is-other">We create an artificial neural net where we have nodes for <strong class="markup--strong markup--p-strong">input values</strong> (what we already know) and <strong class="markup--strong markup--p-strong">output values</strong> (our predictions of what we want to know) and in between those, we have a hidden layer (or layers) where the information travels before it hits the output</span>. This is analogous to the way that the information you see through your eyes is filtered into your understanding, rather than being shot straight into your brain.</p>
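<p class="graf graf--p graf-after--p">Putting a few neurons together, a forward pass through one hidden layer might look roughly like this. The weights are arbitrary made-up numbers (a real network would learn them), and the sigmoid activation is an assumption.</p>

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weight_rows):
    """Each row of weights defines one neuron in the layer."""
    return [sigmoid(sum(x * w for x, w in zip(inputs, row))) for row in weight_rows]

def forward(inputs, hidden_weights, output_weights):
    """Input values flow through the hidden layer, then to the output layer."""
    hidden = layer(inputs, hidden_weights)
    return layer(hidden, output_weights)

hidden_weights = [[0.5, -0.5, 0.1], [0.3, 0.8, -0.2]]   # 3 inputs feed 2 hidden neurons
output_weights = [[1.0, -1.0]]                            # 2 hidden neurons feed 1 output

prediction = forward([0.2, 0.4, 0.6], hidden_weights, output_weights)
print(prediction)
```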
<figure id="888f" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded">
<div style="width: 580px" class="wp-caption alignnone"><img data-recalc-dims="1" loading="lazy" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/85893-133f3-13jmfwqxfwjxqmn4-mak-mg.png?resize=570%2C640" width="570" height="640" /><p class="wp-caption-text"><em>Image by Geralt on <a class="markup--anchor markup--figure-anchor" href="https://pixabay.com/" target="_blank" rel="noopener noreferrer">Pixabay</a></em></p></div>
</div>
</div>
<figcaption class="imageCaption"></figcaption>
</figure>
<p id="6ced" class="graf graf--p graf-after--figure"><span class="markup--quote markup--p-quote is-other">Deep learning models can be supervised, semi-supervised, and unsupervised.</span></p>
<h3 id="dc2d" class="graf graf--h4 graf-after--p">Say what?</h3>
<h4 id="84df" class="graf graf--p graf-after--h4"><strong class="markup--strong markup--p-strong">Supervised learning</strong></h4>
<p id="b5b8" class="graf graf--p graf-after--p">Are you into psychology? This is essentially the machine version of “concept learning.” You know what a concept is (for example an object, idea, event, etc.) based on the belief that each object/idea/event has common features.</p>
<p id="728b" class="graf graf--p graf-after--p">The idea here is that you can be shown a set of example objects with their labels and learn to classify objects based on what you have already been shown. You simplify what you’ve learned from what you’ve been shown, condense it in the form of an example, and then you take that simplified version and apply it to future examples. We really just call this “learning from examples.”</p>
<figure id="0898" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded">
<div style="width: 1610px" class="wp-caption alignnone"><img loading="lazy" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://cdn-images-1.medium.com/max/1600/0*w-AiI00qdZB8SGC0" width="1600" height="2396" /><p class="wp-caption-text"><em>Photo by <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/@gaellemm?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-creator noopener noreferrer">Gaelle Marcel</a> on <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-source noopener noreferrer">Unsplash</a></em></p></div>
</div>
</div>
<figcaption class="imageCaption"></figcaption>
</figure>
<p id="bcfb" class="graf graf--p graf-after--figure">(Dress that baby up a little and it looks like this: <span class="markup--quote markup--p-quote is-other"><em class="markup--em markup--p-em">concept learning refers to the process of inferring a Boolean-valued function from training examples of its input and output</em>.</span>)</p>
<p id="6716" class="graf graf--p graf-after--p">In a nutshell, <span class="markup--quote markup--p-quote is-other">supervised machine learning is the task of learning a function that maps an input to an output based on example input-output pairs. It works with <strong class="markup--strong markup--p-strong">labeled training data</strong> made up of training examples.</span> <span class="markup--quote markup--p-quote is-other">Each example is a pair that’s made up of an input object (usually a vector) and the output value that you want (also called the supervisory signal).</span> <span class="markup--quote markup--p-quote is-other">Your algorithm analyzes the training data and produces an inferred function which can be used to map new examples. Ideally, the algorithm will allow you to classify examples that it hasn’t seen before.</span></p>
<p id="016e" class="graf graf--p graf-after--p"><span class="markup--quote markup--p-quote is-other"><em class="markup--em markup--p-em">Basically, it looks at stuff with labels and uses what it learns from the labeled stuff to predict the labels of other stuff.</em></span></p>
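<p class="graf graf--p graf-after--p">To see “learning from labeled stuff” in miniature, here is a toy nearest-centroid classifier. To be clear, this is a much simpler technique than a deep network and is swapped in only because it fits in a few lines; the data points and function names are invented for the example.</p>

```python
from collections import defaultdict

def fit_centroids(examples):
    """Average the input vectors of each label to get one centroid per label."""
    grouped = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    return {
        label: [sum(column) / len(rows) for column in zip(*rows)]
        for label, rows in grouped.items()
    }

def predict(centroids, features):
    """Return the label whose centroid is closest to the new input."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))

# Labeled training data: (input vector, desired output) pairs.
training_data = [
    ([1.0, 1.2], "cat"), ([0.8, 1.0], "cat"),
    ([3.0, 3.2], "dog"), ([3.2, 2.9], "dog"),
]
model = fit_centroids(training_data)

print(predict(model, [0.9, 1.1]))  # cat
print(predict(model, [2.9, 3.0]))  # dog
```

<p class="graf graf--p graf-after--p">The two inputs passed to <code>predict</code> never appeared in the training data, yet they still get sensible labels. That is the whole point of the inferred function.</p>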
<p id="0412" class="graf graf--p graf-after--p"><span class="markup--quote markup--p-quote is-other"><strong class="markup--strong markup--p-strong">Classification tasks</strong> tend to depend on supervised learning.</span> These tasks might include:</p>
<ul class="postList">
<li id="78ce" class="graf graf--li graf-after--p">Detecting faces, identities, and facial expressions in images</li>
<li id="90a9" class="graf graf--li graf-after--li">Identifying objects in images like stop signs, pedestrians, and lane markers</li>
<li id="fb63" class="graf graf--li graf-after--li">Classifying text as spam</li>
<li id="837f" class="graf graf--li graf-after--li">Recognizing gestures in videos</li>
<li id="b72c" class="graf graf--li graf-after--li">Detecting voices and identifying sentiment in audio recordings</li>
<li id="0448" class="graf graf--li graf-after--li">Identifying speakers</li>
<li id="8910" class="graf graf--li graf-after--li">Transcribing speech-to-text</li>
</ul>
<h4 id="125f" class="graf graf--p graf-after--li"><strong class="markup--strong markup--p-strong">Semi-supervised learning</strong></h4>
<p id="bedf" class="graf graf--p graf-after--p">This one is more like the way you learned from the combination of what your parents explicitly told you as a child (labeled information) combined with what you learned on your own that didn’t have labels, like the flowers and trees that you observed without naming or counting them.</p>
<figure id="d046" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded">
<div style="width: 1610px" class="wp-caption alignnone"><img loading="lazy" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://cdn-images-1.medium.com/max/1600/0*iXTrVeXwrqbGRRqB" width="1600" height="1107" /><p class="wp-caption-text"><em>Photo by <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/@robbie36?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-creator noopener noreferrer">Robert Collins</a> on <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-source noopener noreferrer">Unsplash</a></em></p></div>
</div>
</div>
<figcaption class="imageCaption"></figcaption>
</figure>
<p id="1361" class="graf graf--p graf-after--figure"><span class="markup--quote markup--p-quote is-other">Semi-supervised learning does the same kind of thing as supervised learning, but it’s able to make use of <strong class="markup--strong markup--p-strong">both labeled and unlabeled data</strong> for training.</span><span class="markup--quote markup--p-quote is-other"> In semi-supervised learning, you’re often looking at a lot of unlabeled data and a little bit of labeled data</span>. A number of researchers have found that this process can provide more accuracy than unsupervised learning but without the time and costs associated with fully labeled data. (Sometimes labeling data requires a skilled human being to do things like transcribe audio files or analyze 3D images in order to create labels, which can make creating a fully labeled data set pretty unfeasible, especially when you’re working with those massive data sets that deep learning tasks love.)</p>
<p id="65bb" class="graf graf--p graf-after--p"><span class="markup--quote markup--p-quote is-other">Semi-supervised learning can be referred to as <strong class="markup--strong markup--p-strong">transductive</strong> (inferring correct labels for the given data) or <strong class="markup--strong markup--p-strong">inductive</strong> (inferring the correct mapping from X to Y).</span></p>
<p id="c6f3" class="graf graf--p graf-after--p">In order to do this, semi-supervised learning algorithms have to make at least one of the following assumptions:</p>
<ul class="postList">
<li id="0b87" class="graf graf--li graf-after--p">Points that are close to each other probably share a label (<strong class="markup--strong markup--li-strong">continuity assumption</strong>)</li>
<li id="2155" class="graf graf--li graf-after--li">The data like to form clusters and the points that are clustered together probably share a label (<strong class="markup--strong markup--li-strong">cluster assumption</strong>)</li>
<li id="87cf" class="graf graf--li graf-after--li">The data lie on a manifold of lower dimension than the input space (<strong class="markup--strong markup--li-strong">manifold assumption</strong>). Okay, that’s complicated, but think of it as if you were trying to analyze someone talking — you’d probably want to look at her facial muscles moving her face and her vocal cords making sound and stick to that area, rather than looking in the space of all images and/or all acoustic waves.</li>
</ul>
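<p class="graf graf--p graf-after--p">The continuity assumption can be sketched with a toy label-spreading routine on points along a line. This is a simplified illustration under that one assumption, not a production semi-supervised algorithm; the function name, the points, and the “low”/“high” labels are all hypothetical.</p>

```python
def propagate_labels(labeled, unlabeled):
    """Give each unlabeled point the label of its nearest labeled point.

    Newly labeled points join the labeled pool, so labels can spread
    along chains of nearby points (the continuity assumption).
    """
    pool = list(labeled)
    remaining = list(unlabeled)
    while remaining:
        # Pick the unlabeled point that sits closest to any labeled point.
        best = min(
            ((abs(x - px), x, label) for x in remaining for px, label in pool),
            key=lambda item: item[0],
        )
        _, x, label = best
        pool.append((x, label))
        remaining.remove(x)
    return dict(pool)

# A little labeled data and a larger amount of unlabeled data.
labeled = [(0.0, "low"), (10.0, "high")]
unlabeled = [1.0, 2.0, 3.0, 8.0, 9.0]

result = propagate_labels(labeled, unlabeled)
print(result)
```

<p class="graf graf--p graf-after--p">Only two points started out with labels, but every point ends up with one because close neighbors are assumed to share a label.</p>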
<h4 id="b590" class="graf graf--p graf-after--li"><strong class="markup--strong markup--p-strong">Unsupervised learning </strong>(<em class="markup--em markup--p-em">Hebbian learning</em> is a classic example)</h4>
<p id="8ee0" class="graf graf--p graf-after--p"><span class="markup--quote markup--p-quote is-other">Unsupervised learning involves learning the relationships between elements in a data set and classifying the data without the help of labels.</span> There are a lot of algorithmic forms that this can take, but they all have the same goal of mimicking human logic by searching for hidden structures, features, and patterns in order to analyze new data. These algorithms can include <strong class="markup--strong markup--p-strong">clustering, anomaly detection, neural networks</strong>, and more.</p>
<p id="af9e" class="graf graf--p graf-after--p"><strong class="markup--strong markup--p-strong">Clustering</strong> is essentially the detection of similarities or anomalies within a data set and is a good example of an unsupervised learning task. Clustering can produce highly accurate search results by comparing documents, images, or sounds for similarities and anomalies. Being able to go through a huge amount of data to cluster “ducks” or perhaps the sound of a voice has many, many potential applications. Being able to detect anomalies and unusual behavior accurately can be extremely beneficial for applications like security and fraud detection.</p>
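<p class="graf graf--p graf-after--p">For a feel of how clustering works without labels, here is a bare-bones k-means sketch. K-means is one popular clustering algorithm among many, chosen here as an assumption; the points and the starting centroids are made up.</p>

```python
def k_means(points, centroids, iterations=10):
    """Alternate between assigning points and recomputing centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Move each centroid to the average of its cluster (keep it if empty).
        centroids = [
            tuple(sum(col) / len(cluster) for col in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters

points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = k_means(points, centroids=[(0.0, 0.0), (10.0, 10.0)])

print(clusters)
print(centroids)
```

<p class="graf graf--p graf-after--p">Nobody told the algorithm which points belong together. It finds the two groups from the distances alone.</p>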
<figure id="35de" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded">
<div style="width: 1610px" class="wp-caption alignnone"><img loading="lazy" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://cdn-images-1.medium.com/max/1600/0*LDqOE3_KiaV3M7sj" width="1600" height="1066" /><p class="wp-caption-text"><em>Photo by <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/@andreuuuw?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-creator noopener noreferrer">Andrew Wulf</a> on <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-source noopener noreferrer">Unsplash</a></em></p></div>
</div>
</div>
<figcaption class="imageCaption"></figcaption>
</figure>
<h3 id="d408" class="graf graf--h4 graf-after--figure">Back to it!</h3>
<p id="0e45" class="graf graf--p graf-after--h4">Deep learning algorithms and architectures have been applied to social network filtering, image recognition, financial fraud detection, speech recognition, computer vision, medical image processing, natural language processing, visual art processing, drug discovery and design, toxicology, bioinformatics, customer relationship management, audio recognition, and many, many other fields and concepts. Deep learning models are everywhere!</p>
<p id="6a7e" class="graf graf--p graf-after--p">There are, of course, a number of deep learning techniques that exist, like <strong class="markup--strong markup--p-strong">convolutional neural networks</strong>, <strong class="markup--strong markup--p-strong">recurrent neural networks</strong>, and so on. No one network is better than the others, but some are definitely better suited to specific tasks.</p>
<h3 id="24cf" class="graf graf--h4 graf-after--p">Deep Learning and Artificial Neural Networks</h3>
<p id="b4b7" class="graf graf--p graf-after--h4">The majority of modern deep learning architectures are based on Artificial Neural Networks (ANNs) and use multiple layers of nonlinear processing units for feature extraction and transformation. Each successive layer uses the output of the previous layer for its input. What they learn forms a hierarchy of concepts where each level learns to transform its input data into a slightly more abstract and composite representation.</p>
<figure id="c782" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded">
<div style="width: 650px" class="wp-caption alignnone"><img data-recalc-dims="1" loading="lazy" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/6cbda-fbeef-1d9v2-vusuyrn8vm6i4cdmw.png?resize=640%2C360" width="640" height="360" /><p class="wp-caption-text"><em>Image by ahmedgad on <a class="markup--anchor markup--figure-anchor" href="http://pixabay.com/" target="_blank" rel="noopener noreferrer">Pixabay</a></em></p></div>
</div>
</div>
<figcaption class="imageCaption"></figcaption>
</figure>
<p id="e5a5" class="graf graf--p graf-after--figure">That means that for an image, for example, the input might be a matrix of pixels, then the first layer might abstract the pixels and encode edges, then the next layer might compose an arrangement of edges, then the next layer might encode a nose and eyes, then the next layer might recognize that the image contains a face, and so on. While you may need to do a little fine tuning, the deep learning process learns which features to place in which level on its own!</p>
<figure id="856e" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded">
<div style="width: 1610px" class="wp-caption alignnone"><img loading="lazy" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://cdn-images-1.medium.com/max/1600/0*28856oUBRWadsWpf" width="1600" height="2396" /><p class="wp-caption-text"><em>Photo by <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/@cristian_newman?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-creator noopener noreferrer">Cristian Newman</a> on <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-source noopener noreferrer">Unsplash</a></em></p></div>
</div>
</div>
<figcaption class="imageCaption"></figcaption>
</figure>
<p id="e1f5" class="graf graf--p graf-after--figure">The “deep” in deep learning just refers to the number of layers through which the data is transformed. Deep networks have a substantial credit assignment path (CAP), which is the chain of transformations from input to output. For a <strong class="markup--strong markup--p-strong">feedforward neural network</strong>, the CAP depth is the depth of the network: the number of hidden layers plus one (for the output layer). For a <strong class="markup--strong markup--p-strong">recurrent neural network</strong>, a signal might propagate through a layer more than once, so the CAP depth is potentially unlimited! Most researchers agree that deep learning involves CAP depth &gt; 2.</p>
<h3 id="5a2b" class="graf graf--h4 graf-after--p">Convolutional Neural Networks</h3>
<p id="7047" class="graf graf--p graf-after--h4">One of the most popular types of neural networks is the convolutional neural network (CNN). <span class="markup--quote markup--p-quote is-other">The CNN convolves (not convolutes…) learned features with input data and uses 2D convolutional layers, which means that this type of network is ideal for processing (2D) images.</span> <span class="markup--quote markup--p-quote is-other">The CNN works by extracting features from images, meaning that the need for manual feature extraction is eliminated. The features are not pretrained! They’re learned while the network trains on a set of images, which makes deep learning models extremely accurate for computer vision tasks.</span> CNNs learn feature detection through tens or hundreds of hidden layers, with each layer increasing the complexity of the learned features.</p>
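<p class="graf graf--p graf-after--p">Here is a small sketch of what “convolving a feature with input data” means. One loud caveat: the edge filter below is written by hand purely for illustration, whereas a real CNN learns its filter values during training. The tiny image is made up too.</p>

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    Like most deep learning libraries, this skips flipping the kernel,
    so strictly speaking it computes a cross-correlation.
    """
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(k_rows)
                for j in range(k_cols)
            )
            for c in range(out_cols)
        ]
        for r in range(out_rows)
    ]

# A tiny image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-written filter that responds to dark-to-bright vertical edges.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

print(convolve2d(image, vertical_edge))  # [[0, 2, 0], [0, 2, 0]]
```

<p class="graf graf--p graf-after--p">The output lights up only in the middle column, exactly where the dark region meets the bright one.</p>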
<p id="dd88" class="graf graf--p graf-after--p"><a class="markup--anchor markup--p-anchor" href="https://towardsdatascience.com/wtf-is-image-classification-8e78a8235acb" target="_blank" rel="noopener noreferrer"><strong class="markup--strong markup--p-strong">If you want to keep going with me, we tackle CNNs in depth in part 3!</strong></a></p>
<p id="303e" class="graf graf--p graf-after--p">(Want to learn more? Check out <a class="markup--anchor markup--p-anchor" href="https://web.stanford.edu/class/cs231a/lectures/intro_cnn.pdf" target="_blank" rel="noopener noreferrer"><em class="markup--em markup--p-em">Introduction to Convolutional Neural Networks</em></a> by Jianxin Wu and Yann LeCun’s original article, <a class="markup--anchor markup--p-anchor" href="http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf" target="_blank" rel="noopener noreferrer"><em class="markup--em markup--p-em">Gradient-Based Learning Applied to Document Recognition</em></a>.)</p>
<h3 id="c6ac" class="graf graf--h4 graf-after--p">Recurrent neural networks</h3>
<p id="97a7" class="graf graf--p graf-after--h4"><span class="markup--quote markup--p-quote is-other">While convolutional neural networks are typically used for processing images, recurrent neural networks (RNNs) are typically used for processing sequences, such as language. RNNs don’t just filter information from one layer into the next; they have built-in feedback loops where the output from one layer might be fed back into that layer or the one preceding it. This actually lends the network a sort of memory.</span></p>
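<p class="graf graf--p graf-after--p">The feedback loop can be sketched with a single recurrent unit. This is a minimal illustration that assumes a tanh activation and two fixed, made-up weights; a real RNN has many units and learns its weights.</p>

```python
import math

def rnn_forward(inputs, w_input=0.5, w_hidden=0.8):
    """Run a one-unit recurrent layer over a sequence of numbers.

    The hidden state produced at one step is fed back in at the next step,
    which is what gives the network its memory.
    """
    hidden = 0.0
    states = []
    for x in inputs:
        hidden = math.tanh(w_input * x + w_hidden * hidden)
        states.append(hidden)
    return states

# Only the first input is non-zero, but its effect lingers in later states.
print(rnn_forward([1.0, 0.0, 0.0]))
```

<p class="graf graf--p graf-after--p">Even though the second and third inputs are zero, the later states are not, because the unit still “remembers” the first input.</p>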
<h3 id="bd73" class="graf graf--h4 graf-after--p">Generative adversarial networks</h3>
<p id="34dd" class="graf graf--p graf-after--h4"><span class="markup--quote markup--p-quote is-other">In generative adversarial networks (GANs), two neural networks fight it out. The generator network tries to create convincing “fake” data while the discriminator tries to tell the difference between the fake data and the real stuff. With each training cycle, the generator gets better at creating fake data and the discriminator gets sharper at spotting the fakes. By pitting the two against each other during training, both networks improve.</span> (Basically, shirts vs. skins here. The home team is playing itself to improve its game.) GANs can be used for extremely interesting applications, including generating images from written text. GANs can be tough to work with, but more robust models are constantly being developed.</p>
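<p class="graf graf--p graf-after--p">The tug-of-war can be expressed as two loss functions. The sketch below assumes the standard binary cross-entropy formulation, where the discriminator outputs a probability that a sample is real; the function names and scores are illustrative, and no actual networks are trained here.</p>

```python
import math

def discriminator_loss(score_real, score_fake):
    """Low when real data scores near 1 and fake data scores near 0."""
    return -(math.log(score_real) + math.log(1.0 - score_fake))

def generator_loss(score_fake):
    """Low when the discriminator is fooled into scoring fakes near 1."""
    return -math.log(score_fake)

# Scores are the discriminator's estimated probability that a sample is real.
# Here the discriminator is winning: it trusts the real sample and rejects the fake.
print(discriminator_loss(score_real=0.9, score_fake=0.1))
print(generator_loss(score_fake=0.1))
```

<p class="graf graf--p graf-after--p">Notice that the same fake score pulls the two losses in opposite directions: what is good news for the discriminator is bad news for the generator.</p>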
<figure id="d807" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/53363-9677f-19kzl1kqamd56dpcq-bzc-q.jpeg?w=1080" /></div>
</div>
</figure>
<h3 id="2cb2" class="graf graf--h4 graf-after--figure">Deep Learning in the Future</h3>
<p id="fbcd" class="graf graf--p graf-after--h4">The future is full of potential for anyone interested in deep learning. The most remarkable thing about a neural network is its ability to deal with vast amounts of disparate data. That becomes more and more relevant now that we’re living in an era of advanced smart sensors which can gather an unbelievable amount of data every second of every day. It’s estimated that we are currently generating 2.6 quintillion bytes of data every single day. This is an <em class="markup--em markup--p-em">enormous</em> amount of data. While traditional approaches have trouble dealing with and drawing conclusions from so much data, <strong class="markup--strong markup--p-strong"><em class="markup--em markup--p-em">deep learning models tend to keep getting better as the amount of data grows larger</em></strong>. Neural nets are capable of discovering latent structures within vast amounts of unstructured data, such as raw media, which makes up the majority of data in the world.</p>
<p id="591e" class="graf graf--p graf-after--p">The possibilities are endless!</p>
<p id="9ae0" class="graf graf--p graf-after--p"><em class="markup--em markup--p-em">Still with me? </em><a class="markup--anchor markup--p-anchor" href="https://towardsdatascience.com/simply-deep-learning-an-effortless-introduction-45591a1c4abb" target="_blank" rel="noopener noreferrer"><em class="markup--em markup--p-em">Check out </em><strong class="markup--strong markup--p-strong"><em class="markup--em markup--p-em">part 2 </em></strong><em class="markup--em markup--p-em">where we take a deeper look at artificial neural networks</em></a><em class="markup--em markup--p-em">. Then </em><a class="markup--anchor markup--p-anchor" href="https://towardsdatascience.com/wtf-is-image-classification-8e78a8235acb" target="_blank" rel="noopener noreferrer"><em class="markup--em markup--p-em">head on over to </em><strong class="markup--strong markup--p-strong"><em class="markup--em markup--p-em">part 3</em></strong><em class="markup--em markup--p-em"> where we tackle image classification and convolutional neural networks</em></a><em class="markup--em markup--p-em">!</em></p>
<p id="afce" class="graf graf--p graf-after--p">Want to see how to build a deep learning model from the ground up? Check out this article that tells you exactly <a class="markup--anchor markup--p-anchor" href="https://medium.freecodecamp.org/how-to-build-the-best-image-classifier-3c72010b3d55" target="_blank" rel="noopener noreferrer">how to build an image classifier with PyTorch that has greater than 97% accuracy</a>!</p>
<p id="b229" class="graf graf--p graf-after--p">Need some free GPU, but not sure where to find it? Check out <a class="markup--anchor markup--p-anchor" href="https://towardsdatascience.com/getting-started-with-google-colab-f2fff97f594c" target="_blank" rel="noopener noreferrer">Getting Started with Google Colab</a>.</p>
<p id="9f33" class="graf graf--p graf-after--p">Have you already finished a machine learning model, but you don’t know what to do with it next?</p>
<p id="a2d6" class="graf graf--p graf-after--p"><strong class="markup--strong markup--p-strong">Why not deploy it to the internet?</strong></p>
<p id="543b" class="graf graf--p graf-after--p"><a class="markup--anchor markup--p-anchor" href="https://heartbeat.fritz.ai/brilliant-beginners-guide-to-model-deployment-133e158f6717" target="_blank" rel="noopener noreferrer">Check out this article to learn how to deploy your machine learning model with Flask</a>!</p>
<figure id="6b99" class="graf graf--figure graf-after--p graf--trailing">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded">
<p><img loading="lazy" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://cdn-images-1.medium.com/max/1600/0*ls4ovZlVA2Cjz2ns" width="1600" height="1066" /></p>
</div>
</div>
<figcaption class="imageCaption"></figcaption>
</figure>


<p class="wp-block-paragraph">Thank you for reading!</p>
<span class="et_bloom_bottom_trigger"></span><p>The post <a href="https://contentsimplicity.com/intro-to-deep-learning/">Intro to Deep Learning</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://contentsimplicity.com/intro-to-deep-learning/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">163</post-id>	</item>
		<item>
		<title>How to build an image classifier with greater than 97% accuracy</title>
		<link>https://contentsimplicity.com/how-to-build-an-image-classifier-with-greater-than-97-accuracy/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=how-to-build-an-image-classifier-with-greater-than-97-accuracy&#038;utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=how-to-build-an-image-classifier-with-greater-than-97-accuracy</link>
					<comments>https://contentsimplicity.com/how-to-build-an-image-classifier-with-greater-than-97-accuracy/#respond</comments>
		
		<dc:creator><![CDATA[Anne B]]></dc:creator>
		<pubDate>Wed, 10 Apr 2019 08:34:33 +0000</pubDate>
				<category><![CDATA[tutorial]]></category>
		<category><![CDATA[artificial intelligence]]></category>
		<category><![CDATA[Coding]]></category>
		<category><![CDATA[colab]]></category>
		<category><![CDATA[deep learning]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[GPU]]></category>
		<category><![CDATA[image classification]]></category>
		<category><![CDATA[matplotlib]]></category>
		<category><![CDATA[numpy]]></category>
		<category><![CDATA[pandas]]></category>
		<category><![CDATA[pillow]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[pytorch]]></category>
		<guid isPermaLink="false">https://learnbasictech.wordpress.com/?p=113</guid>

					<description><![CDATA[<p>How do you teach a computer to correctly identify an image as a flower? How do you teach it to tell you what flower it is when you don’t know what it is? Let me show you!</p>
<p>The post <a href="https://contentsimplicity.com/how-to-build-an-image-classifier-with-greater-than-97-accuracy/">How to build an image classifier with greater than 97% accuracy</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<figure class="wp-block-image is-resized"><img data-recalc-dims="1" loading="lazy" decoding="async" data-attachment-id="439" data-permalink="https://contentsimplicity.com/how-to-build-an-image-classifier-with-greater-than-97-accuracy/pytorch_flower/" data-orig-file="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/04/pytorch_flower.png?fit=654%2C464&amp;ssl=1" data-orig-size="654,464" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="pytorch_flower" data-image-description="" data-image-caption="" data-medium-file="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/04/pytorch_flower.png?fit=300%2C213&amp;ssl=1" data-large-file="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/04/pytorch_flower.png?fit=654%2C464&amp;ssl=1" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/04/pytorch_flower.png?resize=805%2C570&#038;ssl=1" alt="" class="wp-image-439" width="805" height="570"/></figure>


<p><em><a href="https://medium.freecodecamp.org/how-to-build-the-best-image-classifier-3c72010b3d55" target="_blank" rel="noopener noreferrer">(This article first appeared on freeCodeCamp)</a></em></p>
<p id="f3cb" class="graf graf--p graf-after--figure">Image classifiers are amazing.</p>
<p class="graf graf--p graf-after--figure">How do you teach a computer to look at an image and correctly identify it as a flower? <strong class="markup--strong markup--p-strong">How do you teach a computer to see an image of a flower and then tell you exactly what species of flower it is when even&nbsp;<em class="markup--em markup--p-em">you</em> don’t know what species it is?</strong></p>
<p id="f023" class="graf graf--p graf-after--p">Let me show you!</p>
<p id="e044" class="graf graf--p graf-after--p">This article will take you through the basics of creating an&nbsp;<span class="markup--quote markup--p-quote is-other">image classifier with PyTorch</span>. You can imagine using something like this in a phone app that tells you the name of the flower your camera is looking at. You could, if you wanted, train this classifier and then export it for use in an application of your own.</p>
<p id="5c21" class="graf graf--p graf-after--p"><strong class="markup--strong markup--p-strong">What you do from here depends entirely on you and your imagination.</strong></p>
<p id="581b" class="graf graf--p graf-after--p">I put this article together for anyone out there who’s brand new to all of this and looking for a place to begin. It’s up to you to take this information, improve on it, and make it your own! Build an even better image classifier if you want to!</p>
<p id="e5db" class="graf graf--p graf-after--p"><a class="markup--anchor markup--p-anchor" href="https://github.com/bonn0062/image_classifier_pytorch" target="_blank" rel="noopener noreferrer">If you want to view the notebook, you can find it here.</a></p>
<p id="4571" class="graf graf--p graf-after--p"><em class="markup--em markup--p-em">Because this PyTorch image classifier was built as a final project for a Udacity program, the code draws on code from Udacity which, in turn, draws on the official PyTorch documentation. Udacity also provided a JSON file for label mapping.&nbsp;</em><a class="markup--anchor markup--p-anchor" href="https://github.com/bonn0062/image_classifier_pytorch" target="_blank" rel="noopener noreferrer"><em class="markup--em markup--p-em">That file can be found in this GitHub repo</em></a><em class="markup--em markup--p-em">.</em></p>
<p id="0318" class="graf graf--p graf-after--p"><a class="markup--anchor markup--p-anchor" href="http://www.robots.ox.ac.uk/~vgg/data/flowers/102/" target="_blank" rel="noopener noreferrer"><em class="markup--em markup--p-em">Information about the flower data set can be found here.</em></a><em class="markup--em markup--p-em">&nbsp;The data set includes a separate folder for each of the 102 flower classes. Each flower is labeled as a number and each of the numbered directories holds a number of&nbsp;.jpg files.</em></p>
<h3 id="3fec" class="graf graf--h3 graf-after--p">Let’s make an image classifier!</h3>
<figure id="7d27" class="graf graf--figure graf-after--h3">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill">&nbsp;</div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://cdn-images-1.medium.com/max/1600/0*7lMC7ZUfjrBG0fgc"></div>
</div><figcaption class="imageCaption">Photo by&nbsp;<a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/@anniespratt?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-creator noopener nofollow noreferrer">Annie Spratt</a>&nbsp;on&nbsp;<a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-source noopener nofollow noreferrer">Unsplash</a></figcaption></figure>
<p id="b55d" class="graf graf--p graf-after--figure">Because this is a neural network using a larger dataset than my CPU could handle in any reasonable amount of time, I went ahead and set up my image classifier in&nbsp;<a class="markup--anchor markup--p-anchor" href="https://colab.research.google.com/" target="_blank" rel="noopener noreferrer">Google Colab</a>. Colab is truly awesome because it provides&nbsp;<strong class="markup--strong markup--p-strong">free GPU</strong>. (If you’re new to Colab,&nbsp;<a class="markup--anchor markup--p-anchor" href="https://towardsdatascience.com/getting-started-with-google-colab-f2fff97f594c" target="_blank" rel="noopener noreferrer">check out this article on getting started with Google Colab</a>!)</p>
<p id="1093" class="graf graf--p graf-after--p">Because I was using Colab, I needed to start by installing&nbsp;<span class="markup--quote markup--p-quote is-other">PyTorch</span><span class="markup--quote markup--p-quote is-other">. You don’t need to do this if you aren’t using Colab</span>.</p>
<p id="1255" class="graf graf--p graf-after--p"><strong class="markup--strong markup--p-strong">*** UPDATE! (01/29)*** Colab now supports native PyTorch!!! You shouldn’t need a separate install step anymore.</strong></p>
<p id="c2bc" class="graf graf--p graf-after--pre">Then, after having some trouble with Pillow (it’s buggy in Colab!), I just went ahead and ran this:</p>
<pre id="9d48" class="graf graf--pre graf-after--p">import PIL
# Newer Pillow releases dropped PIL.PILLOW_VERSION; __version__ is the supported attribute
print(PIL.__version__)</pre>
<p id="4269" class="graf graf--p graf-after--pre">If&nbsp;<span class="markup--quote markup--p-quote is-other">you</span>&nbsp;get anything below 5.3.0, use the dropdown menu under “Runtime” to “Restart runtime” and run this cell again. You should be good to go!</p>
<p id="7823" class="graf graf--p graf-after--p">You’ll want to be using GPU for this project, which is incredibly simple to set up on Colab. You just go to the “runtime” dropdown menu, select “change runtime type” and then select “GPU” in the hardware accelerator drop-down menu!</p>
<figure id="82a3" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill">&nbsp;</div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/3259a-34841-17du2-qmemro5nvxnepnukq.png?w=1080"></div>
</div>
</figure>
<p id="7f2d" class="graf graf--p graf-after--figure">Then I like to run</p>
<pre id="273a" class="graf graf--pre graf-after--p">import torch

train_on_gpu = torch.cuda.is_available()</pre>
<pre id="aeb7" class="graf graf--pre graf-after--pre">if not train_on_gpu:
    print('Bummer!  Training on CPU ...')
else:
    print('You are good to go!  Training on GPU ...')</pre>
<p id="ae0b" class="graf graf--p graf-after--pre">just to make sure it’s working. Then run</p>
<pre id="011a" class="graf graf--pre graf-after--p">device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")</pre>
<p id="9ddd" class="graf graf--p graf-after--pre">to define the device.</p>
<p id="0d6c" class="graf graf--p graf-after--p">After this, load your data. There are a ton of ways to do this, including mounting your Google Drive if you have your dataset stored there, which is actually really&nbsp;<span class="markup--quote markup--p-quote is-other">simple</span>. Even though that didn’t wind up being the best solution for this project, I’m including it below because it’s so easy and handy.</p>
<pre id="ff74" class="graf graf--pre graf-after--p">from google.colab import drive
drive.mount('/content/gdrive')</pre>
<p id="b4dc" class="graf graf--p graf-after--pre">You’ll see a link. Click it, allow access, copy the code that pops up, paste it in the box, and hit enter. You’re good to go! If you don’t see your drive in the side panel on the left, just hit “refresh” and it should show up.</p>
<p id="f763" class="graf graf--p graf-after--p">You’ll see this when you’ve successfully mounted your drive:</p>
<figure id="c927" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill">&nbsp;</div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/032f4-ca043-1vrrlothfotpy_2rutzv3yg.png?w=1080"></div>
</div>
</figure>
<p id="eaaf" class="graf graf--p graf-after--figure">It’s actually super easy!</p>
<p id="cad4" class="graf graf--p graf-after--p">However, if you’d rather download a shared zip file link (this wound up being easier and faster for this project), you can use:</p>
<pre id="1d29" class="graf graf--pre graf-after--p">!wget 
!unzip</pre>
<p id="3d7c" class="graf graf--p graf-after--pre">For example:</p>
<pre id="fa6d" class="graf graf--pre graf-after--p">!wget -cq <a class="markup--anchor markup--pre-anchor" href="https://s3.amazonaws.com/content.udacity-data.com/courses/nd188/flower_data.zip" target="_blank" rel="nofollow noopener noreferrer">https://s3.amazonaws.com/content.udacity-data.com/courses/nd188/flower_data.zip</a>
!unzip -qq flower_data.zip</pre>
<p id="fa0a" class="graf graf--p graf-after--pre">That will give you Udacity’s flower data set in seconds!</p>
<p id="bef3" class="graf graf--p graf-after--p">(For small files, you can upload them directly with a few lines of code, or skip the code entirely: go to the left side of the screen and click “upload files” to grab a local file.)</p>
<figure id="ac8f" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill">&nbsp;</div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/8b416-70f77-11lcilc2o0gyzdup_yl260a.png?w=1080"></div>
</div>
</figure>
<p id="8c61" class="graf graf--p graf-after--figure">After loading the data, I imported the libraries I wanted to use for this image classifier:</p>
<pre id="8e51" class="graf graf--pre graf-after--p">%matplotlib inline
%config InlineBackend.figure_format = 'retina'</pre>
<pre id="191e" class="graf graf--pre graf-after--pre">import os  # needed for os.path.join when loading the datasets
import time
import json
import copy</pre>
<pre id="16f9" class="graf graf--pre graf-after--pre">import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import PIL</pre>
<pre id="e6e2" class="graf graf--pre graf-after--pre">from PIL import Image
from collections import OrderedDict</pre>
<pre id="a15a" class="graf graf--pre graf-after--pre">import torch
from torch import nn, optim
from torch.optim import lr_scheduler
from torch.autograd import Variable
import torchvision
from torchvision import datasets, models, transforms
from torch.utils.data.sampler import SubsetRandomSampler
import torch.nn as nn
import torch.nn.functional as F</pre>
<p id="09f5" class="graf graf--p graf-after--pre">Next come the data transformations! <span class="markup--quote markup--p-quote is-other">You want to make sure to use several different types of transformations on your training set in order to help your program learn as much as it can. You can create a more robust model by training it on flipped, rotated, and cropped images.</span></p>
<p id="3b9a" class="graf graf--p graf-after--p">The means and standard deviations are provided to normalize the image values before passing them to our network, but they can also be found by looking at the mean and standard deviation values of the different dimensions of the image tensors.&nbsp;<a class="markup--anchor markup--p-anchor" href="https://pytorch.org/docs/stable/torchvision/transforms.html" target="_blank" rel="nofollow noopener noreferrer">The official documentation&nbsp;</a>is incredibly helpful here!</p>
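<p class="graf graf--p graf-after--p">To make the normalization step concrete, here is a minimal plain-Python sketch of the arithmetic: each channel value (already scaled to the 0 to 1 range) has its channel mean subtracted and is then divided by its channel standard deviation. The helper names <code>normalize_pixel</code> and <code>channel_stats</code> and the sample pixels are made up for illustration; the means and standard deviations are the ones used in this article.</p>
<pre class="graf graf--pre graf-after--p">from statistics import mean, pstdev

# Channel statistics used in this article (one value per RGB channel)
MEANS = [0.485, 0.456, 0.406]
STDS = [0.229, 0.224, 0.225]


def normalize_pixel(pixel, means=MEANS, stds=STDS):
    """Return (value - mean) / std for each channel of one RGB pixel."""
    return [(value - m) / s for value, m, s in zip(pixel, means, stds)]


def channel_stats(pixels):
    """Per-channel mean and (population) standard deviation of a list of RGB pixels."""
    channels = list(zip(*pixels))
    return [mean(c) for c in channels], [pstdev(c) for c in channels]


# A made-up pixel sitting exactly at the channel means maps to zero
print(normalize_pixel([0.485, 0.456, 0.406]))

# A pure white pixel ends up a bit above 2 in every channel
print([round(v, 3) for v in normalize_pixel([1.0, 1.0, 1.0])])

# Statistics of two made-up pixels, computed the same way you would for a dataset
print(channel_stats([[0.0, 0.5, 1.0], [1.0, 0.5, 0.0]]))</pre>
<p class="graf graf--p graf-after--pre">For a real dataset you would compute these statistics over every pixel of every training image, which is why reusing the published values is usually the easier route.</p>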
<p id="ef9a" class="graf graf--p graf-after--p">For my image classifier, I kept it simple with:</p>
<pre id="ebf3" class="graf graf--pre graf-after--p">data_transforms = {
    'train': transforms.Compose([
        transforms.RandomRotation(30),
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], 
                             [0.229, 0.224, 0.225])
    ]),
    'valid': transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], 
                             [0.229, 0.224, 0.225])
    ])
}</pre>
<pre id="9a51" class="graf graf--pre graf-after--pre"># Assumed location: the folder created by unzipping flower_data.zip
data_dir = 'flower_data'

# Load the datasets with ImageFolder
image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x),
                                          data_transforms[x])
                  for x in ['train', 'valid']}</pre>
<pre id="6392" class="graf graf--pre graf-after--pre"># Using the image datasets and the transforms, define the dataloaders
batch_size = 64
dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=batch_size,
                                             shuffle=True, num_workers=4)
              for x in ['train', 'valid']}</pre>
<pre id="0a35" class="graf graf--pre graf-after--pre">dataset_sizes = {x: len(image_datasets[x]) for x in ['train', 'valid']}
class_names = image_datasets['train'].classes</pre>
<p id="4eda" class="graf graf--p graf-after--pre">As you can see, the code above also defines the batch size, data loaders, and class names.</p>
<p id="8a45" class="graf graf--p graf-after--p">To take a very quick look at the data and check my device, I ran:</p>
<pre id="e981" class="graf graf--pre graf-after--p">print(dataset_sizes)
print(device)</pre>
<pre id="da1e" class="graf graf--pre graf-after--pre"><strong class="markup--strong markup--pre-strong">{'train': 6552, 'valid': 818}</strong>
<strong class="markup--strong markup--pre-strong">cuda:0</strong></pre>
<p id="fd86" class="graf graf--p graf-after--pre">Next, we need to map each label number to the actual flower name. Udacity provided a JSON file that makes this mapping simple.</p>
<pre id="e2d1" class="graf graf--pre graf-after--p">with open('cat_to_name.json', 'r') as f:
    cat_to_name = json.load(f)</pre>
<p id="72fe" class="graf graf--p graf-after--pre">In order to test the data loader, run:</p>
<pre id="bb71" class="graf graf--pre graf-after--p">images, labels = next(iter(dataloaders['train']))
rand_idx = np.random.randint(len(images))
# Print(rand_idx)
print("label: {}, class: {}, name: {}".format(labels[rand_idx].item(),
                                               class_names[labels[rand_idx].item()],
                                               cat_to_name[class_names[labels[rand_idx].item()]]))</pre>
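<p class="graf graf--p graf-after--pre">The printed line chains three lookups: the index the model sees, the folder name, and the flower name. One detail that is easy to miss is that <code>ImageFolder</code> assigns indices by sorting the folder names as strings, so folder <code>10</code> comes before folder <code>2</code>. Here is a plain-Python sketch of that chain. The five folders and the <code>describe</code> helper are made up for illustration, and the name entries are a small hand-typed sample, not the full JSON file.</p>
<pre class="graf graf--pre graf-after--p"># Folder names as they might appear on disk (a made-up subset of the 102 classes)
folders = ['1', '2', '3', '10', '100']

# Sorting as strings puts '10' and '100' ahead of '2'
class_names = sorted(folders)
class_to_idx = {name: idx for idx, name in enumerate(class_names)}

# Sample entries typed in by hand for illustration; the real file has 102 entries
cat_to_name = {'1': 'pink primrose', '2': 'hard-leaved pocket orchid',
               '3': 'canterbury bells', '10': 'globe thistle',
               '100': 'blanket flower'}


def describe(label):
    """Follow the chain: model index, then folder name, then flower name."""
    folder = class_names[label]
    return "label: {}, class: {}, name: {}".format(label, folder, cat_to_name[folder])


print(class_names)
print(describe(1))</pre>
<p class="graf graf--p graf-after--pre">This is why the code looks names up through <code>class_names</code> first instead of passing the label index straight to <code>cat_to_name</code>.</p>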
<p id="9891" class="graf graf--p graf-after--pre">Now it starts to get even more exciting! A number of models in the last several years have been created by people far, far more qualified than most of us for reuse in computer vision problems.&nbsp;<span class="markup--quote markup--p-quote is-other"><a class="markup--anchor markup--p-anchor" href="https://pytorch.org/docs/stable/torchvision/models.html" target="_blank" rel="nofollow noopener noreferrer">PyTorch makes it easy to load pre-trained models and build on them</a>, which is exactly what we’re going to do for this project. The choice of model is entirely up to you!</span></p>
<p id="db40" class="graf graf--p graf-after--p">Some of the most popular pre-trained models that work well for image classifiers, like ResNet, AlexNet, and VGG, come from the ImageNet Challenge. These pre-trained models allow others to quickly obtain cutting-edge results in computer vision without needing such large amounts of computing power, patience, and time. I had great results with DenseNet and decided to use DenseNet161, which reached high accuracy relatively quickly.</p>
<p id="0929" class="graf graf--p graf-after--p">You can quickly set this up by running</p>
<pre id="7aa4" class="graf graf--pre graf-after--p">model = models.densenet161(pretrained=True)</pre>
<p id="d6c9" class="graf graf--p graf-after--pre">but it might be more interesting to give yourself a choice of model, optimizer, and scheduler. In order to set up a choice in architecture, run</p>
<pre id="6d85" class="graf graf--pre graf-after--p">model_name = 'densenet' #vgg
if model_name == 'densenet':
    model = models.densenet161(pretrained=True)
    num_in_features = 2208
    print(model)
elif model_name == 'vgg':
    model = models.vgg19(pretrained=True)
    num_in_features = 25088
    print(model.classifier)
else:
    print("Unknown model, please choose 'densenet' or 'vgg'")</pre>
<p id="78f7" class="graf graf--p graf-after--pre">which allows you to quickly set up an alternate model.</p>
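<p class="graf graf--p graf-after--p">The two <code>num_in_features</code> values are not arbitrary. This short sketch reproduces them by hand, assuming the standard torchvision layouts: DenseNet-161 starts with 96 feature maps, adds 48 per layer across blocks of 6, 12, 36, and 24 layers, and halves the count between blocks, while VGG19 ends with 512 feature maps of 7×7 for a 224×224 input. The function name is hypothetical.</p>
<pre class="graf graf--pre graf-after--p">def densenet_out_features(init_features, growth_rate, block_config):
    """Count the feature maps that reach the DenseNet classifier."""
    features = init_features
    for i, num_layers in enumerate(block_config):
        # Every layer in a dense block adds growth_rate feature maps
        features += num_layers * growth_rate
        # A transition layer halves the count after every block except the last
        if i != len(block_config) - 1:
            features //= 2
    return features


# Assumed DenseNet-161 settings: 96 initial maps, growth rate 48
print(densenet_out_features(96, 48, (6, 12, 36, 24)))  # 2208

# VGG19: 512 maps, each 7 x 7 once a 224 x 224 image has been halved five times
print(512 * (224 // 2 ** 5) ** 2)  # 25088</pre>
<p class="graf graf--p graf-after--pre">In practice you can skip the arithmetic and read the number off the loaded model: <code>model.classifier.in_features</code> for DenseNet, or <code>model.classifier[0].in_features</code> for VGG.</p>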
<p id="68b9" class="graf graf--p graf-after--p">After that, you can start to build your classifier, using the parameters that work best for you. I went ahead and built</p>
<pre id="a491" class="graf graf--pre graf-after--p">for param in model.parameters():
    param.requires_grad = False</pre>
<pre id="b5ae" class="graf graf--pre graf-after--pre">def build_classifier(num_in_features, hidden_layers, num_out_features):
   
    classifier = nn.Sequential()
    if hidden_layers is None:
        # No hidden layers: one linear layer maps the features straight to the classes
        classifier.add_module('fc0', nn.Linear(num_in_features, num_out_features))
    else:
        layer_sizes = zip(hidden_layers[:-1], hidden_layers[1:])
        classifier.add_module('fc0', nn.Linear(num_in_features, hidden_layers[0]))
        classifier.add_module('relu0', nn.ReLU())
        classifier.add_module('drop0', nn.Dropout(.6))
        # The loop below names its layers fc1/relu1/drop1 onward; names must stay
        # unique, because add_module() with an existing name replaces that module
        for i, (h1, h2) in enumerate(layer_sizes):
            classifier.add_module('fc'+str(i+1), nn.Linear(h1, h2))
            classifier.add_module('relu'+str(i+1), nn.ReLU())
            classifier.add_module('drop'+str(i+1), nn.Dropout(.5))
        classifier.add_module('output', nn.Linear(hidden_layers[-1], num_out_features))
        
    return classifier</pre>
<p id="ac27" class="graf graf--p graf-after--pre">which allows for an easy way to change the number of hidden layers that I’m using, as well as quickly adjusting the dropout rate. You may decide to add additional ReLU and dropout layers in order to more finely hone your model.</p>
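<p class="graf graf--p graf-after--p">If the <code>zip</code> over <code>hidden_layers</code> looks cryptic, this plain-Python sketch lists the linear layers that a given <code>hidden_layers</code> value would produce. The helper name <code>linear_layer_shapes</code> and the hidden sizes 1024 and 512 are made up for illustration; only the layer naming follows <code>build_classifier</code>.</p>
<pre class="graf graf--pre graf-after--p">def linear_layer_shapes(num_in_features, hidden_layers, num_out_features):
    """List (name, inputs, outputs) for each linear layer in the classifier."""
    if hidden_layers is None:
        return [('fc0', num_in_features, num_out_features)]
    shapes = [('fc0', num_in_features, hidden_layers[0])]
    # zip pairs each hidden size with the next one: [a, b, c] gives (a, b), (b, c)
    for i, (h1, h2) in enumerate(zip(hidden_layers[:-1], hidden_layers[1:])):
        shapes.append(('fc' + str(i + 1), h1, h2))
    shapes.append(('output', hidden_layers[-1], num_out_features))
    return shapes


# No hidden layers: features go straight to the 102 classes
print(linear_layer_shapes(2208, None, 102))

# Two made-up hidden layers
print(linear_layer_shapes(2208, [1024, 512], 102))</pre>
<p class="graf graf--p graf-after--pre">Each layer’s output size has to match the next layer’s input size, which is exactly what pairing neighbouring entries guarantees.</p>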
<p id="f7ee" class="graf graf--p graf-after--p"><span class="markup--quote markup--p-quote is-other">Next, work on training your classifier parameters. I decided to make sure I only trained the classifier parameters here while having feature parameters frozen. You can get as creative as you want with your optimizer, criterion, and scheduler. The criterion is the method used to evaluate the model fit, the optimizer is the optimization method used to update the weights, and the scheduler provides different methods for adjusting the learning rate and step size used during optimization.</span></p>
<p id="e38e" class="graf graf--p graf-after--p">Try as many options and combinations as you can to see what gives you the best result.&nbsp;<a class="markup--anchor markup--p-anchor" href="https://pytorch.org/docs/stable/optim.html" target="_blank" rel="nofollow noopener noreferrer">You can see all of the official documentation here.</a>&nbsp;I recommend taking a look at it and making your own decisions about what you want to use. You don’t literally have an infinite number of options here, but it sure feels like it once you start playing around!</p>
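<p class="graf graf--p graf-after--p">To get a feel for what a step scheduler does, here is a plain-Python sketch of a StepLR-style schedule: the learning rate is multiplied by <code>gamma</code> once every <code>step_size</code> epochs. It uses the settings from the <code>vgg</code> option in this article (<code>lr=0.0001</code>, <code>step_size=4</code>, <code>gamma=0.1</code>) and assumes the scheduler is stepped once per epoch. The function name <code>step_lr</code> is hypothetical.</p>
<pre class="graf graf--pre graf-after--p">def step_lr(base_lr, step_size, gamma, epoch):
    """Learning rate at a given epoch under a StepLR-style schedule."""
    return base_lr * gamma ** (epoch // step_size)


# Settings from the 'vgg' option: lr=0.0001, step_size=4, gamma=0.1
for epoch in range(0, 12, 2):
    print(epoch, step_lr(0.0001, 4, 0.1, epoch))</pre>
<p class="graf graf--p graf-after--pre">Note that in the <code>train_model</code> function shown in this article, the <code>sched.step()</code> call is commented out, so a schedule like this only takes effect if you re-enable it. The usual place is once per epoch, after the optimizer has stepped.</p>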
<pre id="1fa5" class="graf graf--pre graf-after--p">hidden_layers = None</pre>
<pre id="a291" class="graf graf--pre graf-after--pre">classifier = build_classifier(num_in_features, hidden_layers, 102)
print(classifier)</pre>
<pre id="0d80" class="graf graf--pre graf-after--pre"># Only train the classifier parameters, feature parameters are frozen
if model_name == 'densenet':
    model.classifier = classifier
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adadelta(model.parameters())
    sched = optim.lr_scheduler.StepLR(optimizer, step_size=4)
elif model_name == 'vgg':
    model.classifier = classifier
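    # The classifier outputs raw scores, so use CrossEntropyLoss here too
    # (NLLLoss expects log-probabilities, e.g. from a LogSoftmax layer)
    criterion = nn.CrossEntropyLoss()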
    optimizer = optim.Adam(model.classifier.parameters(), lr=0.0001)
    sched = lr_scheduler.StepLR(optimizer, step_size=4, gamma=0.1)
else:
    pass</pre>
<p id="fd77" class="graf graf--p graf-after--pre">Now it’s time to train your model.</p>
<pre id="7875" class="graf graf--pre graf-after--p"># Adapted from <a class="markup--anchor markup--pre-anchor" href="https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html" target="_blank" rel="nofollow noopener noreferrer">https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html</a></pre>
<pre id="9d4d" class="graf graf--pre graf-after--pre">def train_model(model, criterion, optimizer, sched, num_epochs=5):
    since = time.time()</pre>
<pre id="805f" class="graf graf--pre graf-after--pre">    best_model_wts = copy.deepcopy(model.state_dict())
    best_acc = 0.0</pre>
<pre id="b370" class="graf graf--pre graf-after--pre">    for epoch in range(num_epochs):
        print('Epoch {}/{}'.format(epoch+1, num_epochs))
        print('-' * 10)</pre>
<pre id="76ad" class="graf graf--pre graf-after--pre">        # Each epoch has a training and validation phase
        for phase in ['train', 'valid']:
            if phase == 'train':
                model.train()  # Set model to training mode
            else:
                model.eval()   # Set model to evaluate mode</pre>
<pre id="0976" class="graf graf--pre graf-after--pre">            running_loss = 0.0
            running_corrects = 0</pre>
<pre id="86ba" class="graf graf--pre graf-after--pre">            # Iterate over data.
            for inputs, labels in dataloaders[phase]:
                inputs = inputs.to(device)
                labels = labels.to(device)</pre>
<pre id="8ba9" class="graf graf--pre graf-after--pre">                # Zero the parameter gradients
                optimizer.zero_grad()</pre>
<pre id="98a8" class="graf graf--pre graf-after--pre">                # Forward
                # track history if only in train
                with torch.set_grad_enabled(phase == 'train'):
                    outputs = model(inputs)
                    _, preds = torch.max(outputs, 1)
                    loss = criterion(outputs, labels)</pre>
<pre id="eda3" class="graf graf--pre graf-after--pre">                    # Backward + optimize only if in training phase
                    if phase == 'train':
                        #sched.step()
                        loss.backward()
                        
                        optimizer.step()</pre>
<pre id="0048" class="graf graf--pre graf-after--pre">                # Statistics
                running_loss += loss.item() * inputs.size(0)
                running_corrects += torch.sum(preds == labels.data)</pre>
<pre id="13ff" class="graf graf--pre graf-after--pre">            epoch_loss = running_loss / dataset_sizes[phase]
            epoch_acc = running_corrects.double() / dataset_sizes[phase]</pre>
<pre id="fa76" class="graf graf--pre graf-after--pre">            print('{} Loss: {:.4f} Acc: {:.4f}'.format(
                phase, epoch_loss, epoch_acc))</pre>
<pre id="f993" class="graf graf--pre graf-after--pre">            # Deep copy the model
            if phase == 'valid' and epoch_acc &gt; best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())</pre>
<pre id="5ea0" class="graf graf--pre graf-after--pre">        print()</pre>
<pre id="71b1" class="graf graf--pre graf-after--pre">    time_elapsed = time.time() - since
    print('Training complete in {:.0f}m {:.0f}s'.format(
        time_elapsed // 60, time_elapsed % 60))
    print('Best val Acc: {:4f}'.format(best_acc))</pre>
<pre id="2ddc" class="graf graf--pre graf-after--pre">    # Load best model weights
    model.load_state_dict(best_model_wts)
    
    return model</pre>
<pre id="acb4" class="graf graf--pre graf-after--pre">epochs = 30
model.to(device)
model = train_model(model, criterion, optimizer, sched, epochs)</pre>
<p id="7528" class="graf graf--p graf-after--pre">I wanted to be able to monitor my epochs easily and also keep track of the time elapsed as my model was running. The code above includes both, and the results are pretty good! You can see that the model is quickly learning and the accuracy on the validation set quickly reached over 95% by epoch 7!</p>
<pre id="d9f3" class="graf graf--pre graf-after--p">Epoch 1/30
----------
train Loss: 2.4793 Acc: 0.4791
valid Loss: 0.9688 Acc: 0.8191

Epoch 2/30
----------
train Loss: 0.8288 Acc: 0.8378
valid Loss: 0.4714 Acc: 0.9010

Epoch 3/30
----------
train Loss: 0.5191 Acc: 0.8890
valid Loss: 0.3197 Acc: 0.9181

Epoch 4/30
----------
train Loss: 0.4064 Acc: 0.9095
valid Loss: 0.2975 Acc: 0.9169

Epoch 5/30
----------
train Loss: 0.3401 Acc: 0.9214
valid Loss: 0.2486 Acc: 0.9401

Epoch 6/30
----------
train Loss: 0.3111 Acc: 0.9303
valid Loss: 0.2153 Acc: 0.9487

Epoch 7/30
----------
train Loss: 0.2987 Acc: 0.9298
valid Loss: 0.1969 Acc: 0.9584</pre>
<pre id="97d6" class="graf graf--pre graf-after--pre">...</pre>
<pre id="6518" class="graf graf--pre graf-after--pre">Training complete in 67m 43s
Best val Acc: 0.973105</pre>
<p id="629a" class="graf graf--p graf-after--pre">You can see that running this code on Google Colab with GPU took just over an hour.</p>
<p id="c247" class="graf graf--p graf-after--p">Now it’s time for evaluation:</p>
<pre id="602d" class="graf graf--pre graf-after--p">model.eval()</pre>
<pre id="87e1" class="graf graf--pre graf-after--pre">accuracy = 0</pre>
<pre id="8836" class="graf graf--pre graf-after--pre">for inputs, labels in dataloaders['valid']:
    inputs, labels = inputs.to(device), labels.to(device)
    outputs = model(inputs)
    
    # Class with the highest probability is our predicted class
    equality = (labels.data == outputs.max(1)[1])</pre>
<pre id="c190" class="graf graf--pre graf-after--pre">    # Accuracy = number of correct predictions divided by all predictions
    accuracy += equality.type_as(torch.FloatTensor()).mean()
    
print("Test accuracy: {:.3f}".format(accuracy/len(dataloaders['valid'])))</pre>
<pre id="c997" class="graf graf--pre graf-after--pre"><strong class="markup--strong markup--pre-strong">Test accuracy: 0.973</strong></pre>
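<p>One detail worth knowing: the loop above averages the accuracy of each batch, so a smaller final batch counts just as much as a full one. With evenly sized batches the result matches the overall accuracy, and it’s usually very close either way. Here’s a rough sketch of the difference using made-up predictions (the function names are mine, for illustration only):</p>

```python
def mean_of_batch_accuracies(batches):
    """Average the per-batch accuracies (what the loop above computes)."""
    per_batch = [sum(batch) / len(batch) for batch in batches]
    return sum(per_batch) / len(per_batch)


def overall_accuracy(batches):
    """Correct predictions divided by all predictions."""
    correct = sum(sum(batch) for batch in batches)
    total = sum(len(batch) for batch in batches)
    return correct / total


# Made-up results: 1 = correct prediction, 0 = wrong prediction.
# The last batch is smaller, as often happens at the end of a dataset.
batches = [[1, 1, 1, 1], [1, 1, 1, 0], [0]]

print(mean_of_batch_accuracies(batches))
print(overall_accuracy(batches))
```

<p>With real validation data the gap is much smaller than in this toy example, but if you need the exact figure, count correct predictions and total predictions across all batches instead.</p>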
<p id="165c" class="graf graf--p graf-after--pre">It’s important to save your checkpoint:</p>
<pre id="b003" class="graf graf--pre graf-after--p">model.class_to_idx = image_datasets['train'].class_to_idx</pre>
<pre id="6b06" class="graf graf--pre graf-after--pre">checkpoint = {'input_size': 2208,
              'output_size': 102,
              'epochs': epochs,
              'batch_size': 64,
              'model': models.densenet161(pretrained=True),
              'classifier': classifier,
              'scheduler': sched,
              'optimizer': optimizer.state_dict(),
              'state_dict': model.state_dict(),
              'class_to_idx': model.class_to_idx
             }
   
torch.save(checkpoint, 'checkpoint.pth')</pre>
<p id="14ab" class="graf graf--p graf-after--pre">You don’t have to save all of the parameters, but I’m including them here as an example. This checkpoint specifically saves the model with a pre-trained densenet161 architecture, but if you want to save your checkpoint with the two-choice option, you can absolutely do that. Simply adjust the input size and model.</p>
<p id="94d3" class="graf graf--p graf-after--p">Now you’re able to load your checkpoint. If you’re submitting your project into the Udacity workspace, things can get a little tricky.&nbsp;<a class="markup--anchor markup--p-anchor" href="https://towardsdatascience.com/load-that-checkpoint-51142d44fb5d" target="_blank" rel="noopener noreferrer">Here’s some help with troubleshooting your checkpoint load</a>.</p>
<p id="a355" class="graf graf--p graf-after--p">You can check your keys by running:</p>
<pre id="b0bc" class="graf graf--pre graf-after--p">ckpt = torch.load('checkpoint.pth')
ckpt.keys()</pre>
<p id="f13d" class="graf graf--p graf-after--pre">Then load and rebuild your model!</p>
<pre id="0288" class="graf graf--pre graf-after--p">def load_checkpoint(filepath):
    checkpoint = torch.load(filepath)
    model = checkpoint['model']
    model.classifier = checkpoint['classifier']
    model.load_state_dict(checkpoint['state_dict'])
    model.class_to_idx = checkpoint['class_to_idx']
    optimizer = checkpoint['optimizer']
    epochs = checkpoint['epochs']
    
    for param in model.parameters():
        param.requires_grad = False
        
    return model, checkpoint['class_to_idx']</pre>
<pre id="6768" class="graf graf--pre graf-after--pre">model, class_to_idx = load_checkpoint('checkpoint.pth')</pre>
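<p>If you trained on a GPU and later want to load the checkpoint on a machine without one, a lighter-weight pattern is to save only the weights and the class mapping, then pass <code class="markup--code markup--p-code">map_location</code> when loading. This is just a sketch under my own assumptions: the tiny <code class="markup--code markup--p-code">nn.Linear</code> model, the two-class mapping, and the file name are stand-ins, not the densenet161 setup from this article.</p>

```python
import os
import tempfile

import torch
from torch import nn

# A tiny stand-in model; in the article this would be the densenet161 classifier.
model = nn.Linear(4, 2)
class_to_idx = {'daisy': 0, 'rose': 1}  # hypothetical mapping

path = os.path.join(tempfile.mkdtemp(), 'small_checkpoint.pth')
torch.save({'state_dict': model.state_dict(),
            'class_to_idx': class_to_idx}, path)

# map_location='cpu' lets a checkpoint saved on a GPU load on a CPU-only machine
checkpoint = torch.load(path, map_location='cpu')

# Rebuild the same architecture, then restore the trained weights
restored = nn.Linear(4, 2)
restored.load_state_dict(checkpoint['state_dict'])
restored.class_to_idx = checkpoint['class_to_idx']
restored.eval()
```

<p>The trade-off is that you have to rebuild the architecture in code before loading, but the checkpoint file is smaller and doesn’t depend on pickling a whole model object.</p>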
<p id="3ea5" class="graf graf--p graf-after--pre">Want to keep going? It’s a good idea to do some image preprocessing and inference for classification. Go ahead and define your image path and open an image:</p>
<pre id="6696" class="graf graf--pre graf-after--p">image_path = 'flower_data/valid/102/image_08006.jpg'
img = Image.open(image_path)</pre>
<p id="b457" class="graf graf--p graf-after--pre">Process your image and take a look at a processed image:</p>
<pre id="fa7e" class="graf graf--pre graf-after--p">def process_image(image):
    ''' Scales, crops, and normalizes a PIL image for a PyTorch model,
        returns a PyTorch tensor
    '''
    # Process a PIL image for use in a PyTorch model
    # tensor.numpy().transpose(1, 2, 0)
    preprocess = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], 
                             std=[0.229, 0.224, 0.225])
    ])
    image = preprocess(image)
    return image</pre>
<pre id="8a0f" class="graf graf--pre graf-after--pre">def imshow(image, ax=None, title=None):
    """Imshow for Tensor."""
    if ax is None:
        fig, ax = plt.subplots()
    
    # PyTorch tensors assume the color channel is the first dimension
    # but matplotlib assumes it is the third dimension
    image = image.numpy().transpose((1, 2, 0))
    
    # Undo preprocessing
    mean = np.array([0.485, 0.456, 0.406])
    std = np.array([0.229, 0.224, 0.225])
    image = std * image + mean
    
    # Image needs to be clipped between 0 and 1 or it looks like noise when displayed
    image = np.clip(image, 0, 1)
    
    ax.imshow(image)
    
    return ax</pre>
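<p>To see why <code class="markup--code markup--p-code">imshow</code> multiplies by the standard deviation and adds the mean, here’s a small worked example on a single made-up pixel. It uses plain Python instead of tensors, and the pixel values are invented for illustration.</p>

```python
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

pixel = [0.5, 0.25, 0.75]  # made-up RGB values already scaled to 0..1

# What transforms.Normalize does per channel: (value - mean) / std
normalized = [(p - m) / s for p, m, s in zip(pixel, mean, std)]

# What imshow does to undo it: std * value + mean
restored = [s * n + m for n, m, s in zip(normalized, mean, std)]

print(normalized)
print(restored)
```

<p>The restored values match the original pixel up to floating-point rounding, which is why the image displays properly again.</p>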
<pre id="91a7" class="graf graf--pre graf-after--pre">with Image.open('flower_data/valid/102/image_08006.jpg') as image:
    plt.imshow(image)</pre>
<pre id="74e1" class="graf graf--pre graf-after--pre">model.class_to_idx = image_datasets['train'].class_to_idx</pre>
<div class="aspectRatioPlaceholder is-locked">
<figure id="a4ec" class="graf graf--figure graf-after--pre">
<div class="aspectRatioPlaceholder is-locked">
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/50a39-30be0-1nq6ozkt-6f6jzmxvf1vg7g.png?w=1080"></div>
</div>
</figure>
<p id="b68e" class="graf graf--p graf-after--figure">Create a function for prediction:</p>
<pre id="8e9d" class="graf graf--pre graf-after--p">def predict2(image_path, model, topk=5):
    ''' Predict the class (or classes) of an image using a trained deep learning model.
    '''
    
    # Implement the code to predict the class from an image file
    img = Image.open(image_path)
    img = process_image(img)
    
    # Add a batch dimension so the model receives a batch of one image
    img = np.expand_dims(img, 0)
    
    
    img = torch.from_numpy(img)
    
    model.eval()
    inputs = Variable(img).to(device)
    logits = model.forward(inputs)
    
    ps = F.softmax(logits,dim=1)
    topk = ps.cpu().topk(topk)
    
    return (e.data.numpy().squeeze().tolist() for e in topk)</pre>
<p id="ca9d" class="graf graf--p graf-after--pre">Once the images are in the correct format, you can write a function to make predictions with your model. One common practice is to predict the top 5 or so (usually called top-K) most probable classes. You’ll want to calculate the class probabilities, then find the K largest values.</p>
<p id="f801" class="graf graf--p graf-after--p">To get the top K largest values in a tensor, use <code class="markup--code markup--p-code">x.topk(k)</code>. This method returns both the highest k probabilities and the indices of those probabilities corresponding to the classes. You need to convert from these indices to the actual class labels using class_to_idx, which you added to the model, or from the ImageFolder you used to load the data. Make sure to invert the dictionary so you get a mapping from index to class as well.</p>
<p id="4ccd" class="graf graf--p graf-after--p">This method should take a path to an image and a model checkpoint, then return the probabilities and classes.</p>
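<p>Here’s a minimal sketch of those two steps, top-K and the inverted dictionary, on their own. The probabilities and the <code class="markup--code markup--p-code">class_to_idx</code> mapping are made up for illustration. I chose folder names like '1', '10', '100', and '2' because ImageFolder sorts class folders as strings, which is exactly why you can’t assume the model’s output index equals the folder name.</p>

```python
import torch

# Made-up probabilities for one image over four classes (batch size of 1)
ps = torch.tensor([[0.05, 0.70, 0.05, 0.20]])

# Hypothetical class_to_idx mapping: folder name to model output index
class_to_idx = {'1': 0, '10': 1, '100': 2, '2': 3}

# Invert the dictionary so we can go from index back to class label
idx_to_class = {idx: cls for cls, idx in class_to_idx.items()}

# topk returns the k largest values and their indices, largest first
top_p, top_idx = ps.topk(2, dim=1)

probs = top_p.squeeze(0).tolist()
classes = [idx_to_class[i] for i in top_idx.squeeze(0).tolist()]

print(probs)
print(classes)  # ['10', '2']
```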
<pre id="10a6" class="graf graf--pre graf-after--p">img_path = 'flower_data/valid/18/image_04252.jpg'
probs, classes = predict2(img_path, model.to(device))
print(probs)
print(classes)
flower_names = [cat_to_name[class_names[e]] for e in classes]
print(flower_names)</pre>
<p id="ebc3" class="graf graf--p graf-after--pre">I was pretty pleased with how my model performed!</p>
<pre id="2e23" class="graf graf--pre graf-after--p">[0.9999195337295532, 1.4087702766119037e-05, 1.3897360986447893e-05, 1.1400215043977369e-05, 6.098791800468462e-06]
[12, 86, 7, 88, 40]
['peruvian lily', 'desert-rose', 'king protea', 'magnolia', 'sword lily']</pre>
<p id="a500" class="graf graf--p graf-after--pre">Basically, it’s nearly 100% likely that the image I specified is a Peruvian Lily. Want to take a look? Try using matplotlib to plot the probabilities for the top five classes in a bar graph along with the input image:</p>
<pre id="42ea" class="graf graf--pre graf-after--p">def view_classify(img_path, prob, classes, mapping):
    ''' Function for viewing an image and its predicted classes.
    '''
    image = Image.open(img_path)</pre>
<pre id="9c07" class="graf graf--pre graf-after--pre">    fig, (ax1, ax2) = plt.subplots(figsize=(6,10), ncols=1, nrows=2)
    flower_name = mapping[img_path.split('/')[-2]]
    ax1.set_title(flower_name)
    ax1.imshow(image)
    ax1.axis('off')
    
    y_pos = np.arange(len(prob))
    ax2.barh(y_pos, prob, align='center')
    ax2.set_yticks(y_pos)
    ax2.set_yticklabels(flower_names)
    ax2.invert_yaxis()  # labels read top-to-bottom
    ax2.set_title('Class Probability')</pre>
<pre id="dd7e" class="graf graf--pre graf-after--pre">view_classify(img_path, probs, classes, cat_to_name)</pre>
<p id="c99b" class="graf graf--p graf-after--pre">You should see something like this:</p>
<figure id="d947" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/4d587-87d41-1eu4tgmafpzettvs7ltmouq.png?w=1080"></div>
</div>
</figure>
<p id="af53" class="graf graf--p graf-after--figure">I’ve got to say, I’m pretty happy with that! I recommend testing a few other images to see how close your predictions are on a variety of images.</p>
<figure id="7030" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill">&nbsp;</div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/07204-7eade-1atepr7c3gu_epgxeo_ryzq.png?w=1080"></div>
</div>
</figure>
<p id="c286" class="graf graf--p graf-after--figure">Now it’s time to make an image classifier of your own! Let me know how it goes in the responses below.</p>
<figure id="5803" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill">&nbsp;</div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://cdn-images-1.medium.com/max/1600/0*QVddlYxKEf5UrKFQ"></div>
</div>
<figcaption class="imageCaption">Photo by&nbsp;<a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/@pezgonzalez?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-creator nofollow noopener noreferrer">Pez González</a>&nbsp;on&nbsp;<a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-source nofollow noopener noreferrer">Unsplash</a></figcaption>
</figure>
<p id="65c8" class="graf graf--p graf-after--figure">Have you finished your deep learning or machine learning model, but you don’t know what to do with it next? Why not deploy it to the internet?</p>
<p id="6da0" class="graf graf--p graf-after--p"><strong class="markup--strong markup--p-strong">Get your model out there so everyone can see it!</strong></p>
<p id="e91f" class="graf graf--p graf-after--p graf--trailing"><a class="markup--anchor markup--p-anchor" href="https://heartbeat.fritz.ai/brilliant-beginners-guide-to-model-deployment-133e158f6717" target="_blank" rel="noopener noreferrer">Check out this article to learn how to deploy your machine learning model with Flask</a>!</p>
</div>


<p class="wp-block-paragraph">Thank you for reading!</p>
<span class="et_bloom_bottom_trigger"></span><p>The post <a href="https://contentsimplicity.com/how-to-build-an-image-classifier-with-greater-than-97-accuracy/">How to build an image classifier with greater than 97% accuracy</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://contentsimplicity.com/how-to-build-an-image-classifier-with-greater-than-97-accuracy/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">198</post-id>	</item>
		<item>
		<title>Getting started with Google Colab</title>
		<link>https://contentsimplicity.com/getting-started-with-google-colab/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=getting-started-with-google-colab&#038;utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=getting-started-with-google-colab</link>
					<comments>https://contentsimplicity.com/getting-started-with-google-colab/#respond</comments>
		
		<dc:creator><![CDATA[Anne B]]></dc:creator>
		<pubDate>Mon, 08 Apr 2019 15:45:37 +0000</pubDate>
				<category><![CDATA[data]]></category>
		<category><![CDATA[tutorial]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[artificial intelligence]]></category>
		<category><![CDATA[beginner]]></category>
		<category><![CDATA[Coding]]></category>
		<category><![CDATA[colab]]></category>
		<category><![CDATA[deep learning]]></category>
		<category><![CDATA[free]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[GPU]]></category>
		<category><![CDATA[machine learning]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[setup]]></category>
		<guid isPermaLink="false">https://learnbasictech.wordpress.com/?p=42</guid>

					<description><![CDATA[<p>Looking for free GPU? Google Colab has your back! Wondering how on earth to get it to work? You’re in the right place! Let me show you...</p>
<p>The post <a href="https://contentsimplicity.com/getting-started-with-google-colab/">Getting started with Google Colab</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<h3 class="wp-block-heading" id="9a5d">A Simple Tutorial for the Frustrated and&nbsp;Confused</h3>


<p><em>Photo by <a href="https://unsplash.com/@hhh13?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="noreferrer noopener">FuYong Hua</a> on <a href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="noreferrer noopener">Unsplash</a></em></p>
<p><a href="https://towardsdatascience.com/getting-started-with-google-colab-f2fff97f594c"><em>(This article first appeared on Towards Data Science)</em></a></p>
<p id="2ab4" class="graf graf--p graf-after--h4"><strong>You know it’s out there. You know there’s free GPU somewhere, hanging like a fat, juicy, ripe blackberry on a branch just slightly out of reach.</strong></p>
<p id="3c41" class="graf graf--p graf-after--p">Beautiful lightning-fast speed waiting just for you.</p>
<p id="9569" class="graf graf--p graf-after--p">Wondering how on earth to get it to work? You’re in the right place!</p>
<figure id="97d4" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded">
<div style="width: 1610px" class="wp-caption alignnone"><img loading="lazy" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://cdn-images-1.medium.com/max/1600/0*RCOiZb0ZDKCJ8cKX" width="1600" height="1066" /><p class="wp-caption-text"><em>Photo by <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/@brenomachado?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-creator noopener noreferrer">Breno Machado</a> on <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-source noopener noreferrer">Unsplash</a></em></p></div>
</div>
</div>
<figcaption class="imageCaption"></figcaption>
</figure>
<p id="ec9a" class="graf graf--p graf-after--figure">For anyone who doesn’t already know, Google has done the coolest thing ever by providing a free cloud service based on Jupyter Notebooks that supports free GPU. Not only is this a great tool for improving your coding skills, but it also allows absolutely anyone to develop deep learning applications using popular libraries such as <strong class="markup--strong markup--p-strong">PyTorch</strong>, <strong class="markup--strong markup--p-strong">TensorFlow</strong>, <strong class="markup--strong markup--p-strong">Keras, </strong>and <strong class="markup--strong markup--p-strong">OpenCV.</strong></p>
<blockquote id="c25a" class="graf graf--pullquote graf-after--p">
<p><span class="markup--quote markup--pullquote-quote is-other">Colab provides GPU and it’s <strong class="markup--strong markup--pullquote-strong">totally free</strong>. Seriously!</span></p>
</blockquote>
<p id="6e6e" class="graf graf--p graf-after--pullquote">There are, of course, limits. (The nitty-gritty details are available on their <a class="markup--anchor markup--p-anchor" href="https://research.google.com/colaboratory/faq.html" target="_blank" rel="noopener noreferrer">FAQ page</a>.) It supports <strong class="markup--strong markup--p-strong">Python 2.7</strong> and <strong class="markup--strong markup--p-strong">3.6</strong>, but not <strong class="markup--strong markup--p-strong">R</strong> or <strong class="markup--strong markup--p-strong">Scala</strong> yet. There is a limit to your sessions and size, but you can definitely get around that if you’re creative and don’t mind occasionally re-uploading your files…</p>
<p id="dbc4" class="graf graf--p graf-after--p">Colab is ideal for everything from improving your Python coding skills to working with deep learning libraries, like <strong class="markup--strong markup--p-strong">PyTorch</strong>, <strong class="markup--strong markup--p-strong">Keras</strong>, <strong class="markup--strong markup--p-strong">TensorFlow</strong>, and <strong class="markup--strong markup--p-strong">OpenCV</strong>. You can create notebooks in Colab, upload notebooks, store notebooks, share notebooks, mount your Google Drive and use whatever you’ve got stored in there, import most of your favorite libraries, upload your personal Jupyter Notebooks, upload notebooks directly from GitHub, upload Kaggle files, download your notebooks, and do just about everything else that you might want to be able to do.</p>
<p id="1aae" class="graf graf--p graf-after--p">It’s awesome.</p>
<p id="f067" class="graf graf--p graf-after--p">Working in Google Colab for the first time has been totally phenomenal and pretty shockingly easy, but it hasn’t been without a couple of small challenges! If you know Jupyter Notebooks at all, you’re pretty much good to go in Google Colab, but there are a few little differences that can decide whether you’re flying off to freedom on the wings of free GPU or sitting at your computer, banging your head against the wall…</p>
<figure id="69da" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded">
<div style="width: 1610px" class="wp-caption alignnone"><img loading="lazy" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://cdn-images-1.medium.com/max/1600/0*n47GOQJCB6Rj84Vk" width="1600" height="1067" /><p class="wp-caption-text"><em>Photo by <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/@gmat07?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-creator nofollow noopener noreferrer">Gabriel Matula</a> on <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-source nofollow noopener noreferrer">Unsplash</a></em></p></div>
</div>
</div>
<figcaption class="imageCaption"></figcaption>
</figure>
<p id="a555" class="graf graf--p graf-after--figure">This article is for anyone out there who is confused, frustrated, and just wants this thing to work!</p>
<h4 id="11a4" class="graf graf--h4 graf-after--p">Setting up your drive</h4>
<p id="106c" class="graf graf--p graf-after--h4"><strong class="markup--strong markup--p-strong">Create a folder for your notebooks</strong></p>
<p id="21aa" class="graf graf--p graf-after--p">(Technically speaking, this step isn’t totally necessary if you want to just start working in Colab. However, since Colab is working off of your drive, it’s not a bad idea to specify the folder where you want to work. You can do that by going to your <a class="markup--anchor markup--p-anchor" href="https://drive.google.com/" target="_blank" rel="noopener noreferrer">Google Drive</a> and clicking “New” and then creating a new folder. I only mention this because my Google Drive is embarrassingly littered with what looks like a million scattered Colab notebooks and now I’m going to have to deal with that.)</p>
<figure id="d292" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/f38c2-ec2c3-1rp17m-6cbfynldko2v4jmg.png?w=1080" /></div>
</div>
</figure>
<p id="2647" class="graf graf--p graf-after--figure">If you want, while you’re already in your Google Drive you can create a new Colab notebook. Just click “New” and drop the menu down to “More” and then select “Colaboratory.”</p>
<figure id="1f3a" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/369ef-d0a1d-1sjfggypnyhpafndbna7edg.png?w=1080" /></div>
</div>
</figure>
<p id="3dc2" class="graf graf--p graf-after--figure">Otherwise, you can always go directly to <a class="markup--anchor markup--p-anchor" href="https://colab.research.google.com/" target="_blank" rel="noopener noreferrer">Google Colab</a>.</p>
<h4 id="edbc" class="graf graf--h4 graf-after--p">Game on!</h4>
<p id="4202" class="graf graf--p graf-after--h4">You can rename your notebook by clicking on the name of the notebook and changing it or by dropping the “File” menu down to “Rename.”</p>
<figure id="e0da" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/727a1-d1a99-1lziarbuysu5zpwdawudnbq.png?w=1080" /></div>
</div>
</figure>
<h4 id="fa2d" class="graf graf--h4 graf-after--figure">Set up your free GPU</h4>
<p id="8d33" class="graf graf--p graf-after--h4">Want to use GPU? It’s as simple as going to the “runtime” dropdown menu, selecting “change runtime type” and selecting GPU in the hardware accelerator drop-down menu!</p>
<figure id="82a3" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/3259a-34841-17du2-qmemro5nvxnepnukq.png?w=1080" /></div>
</div>
</figure>
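<p>Want to double-check that the GPU is really there? One rough way is to look for NVIDIA’s <code class="markup--code markup--p-code">nvidia-smi</code> tool. I’m assuming here that it’s installed on GPU runtimes (it has been whenever I’ve looked, but treat that as an assumption), and the function name <code class="markup--code markup--p-code">gpu_visible</code> is just something I made up for this sketch.</p>

```python
import shutil
import subprocess


def gpu_visible():
    """Return True if the nvidia-smi tool exists and runs successfully."""
    if shutil.which('nvidia-smi') is None:
        return False
    result = subprocess.run(['nvidia-smi'], capture_output=True, text=True)
    return result.returncode == 0


print('GPU runtime detected' if gpu_visible() else 'No GPU detected')
```

<p>If it reports no GPU, go back to the runtime menu and make sure the hardware accelerator is set to GPU.</p>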
<h4 id="3f4e" class="graf graf--h4 graf-after--figure">Get coding!</h4>
<p id="5417" class="graf graf--p graf-after--h4">You can easily start running code now if you want! You are good to go!</p>
<h4 id="0daa" class="graf graf--h4 graf-after--p">Make it better</h4>
<p id="337d" class="graf graf--p graf-after--h4">Want to mount your Google Drive? Use:</p>
<pre id="a062" class="graf graf--pre graf-after--p">from google.colab import drive
drive.mount('/content/gdrive')</pre>
<p id="b4dc" class="graf graf--p graf-after--pre">Then you’ll see a link. Click on it, allow access, copy the code that pops up, paste it in the box, and hit enter. You’re good to go! If you don’t see your drive in the side box on the left, just hit “refresh” and it should show up.</p>
<p id="f763" class="graf graf--p graf-after--p">(You’ll see this when you’ve successfully mounted your drive:)</p>
<figure id="c927" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/032f4-ca043-1vrrlothfotpy_2rutzv3yg.png?w=1080" /></div>
</div>
</figure>
<p id="9193" class="graf graf--p graf-after--figure">Now you can see your drive right there on the left-hand side of the screen! (You may need to hit “refresh.”) Plus, you can reach your drive any time with</p>
<pre id="1bf7" class="graf graf--pre graf-after--p">!ls "/content/gdrive/My Drive/"</pre>
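<p>If you’d rather stay in Python than use a shell command, something like this works too. It’s a small sketch: <code class="markup--code markup--p-code">list_folder</code> is my own helper name, and it assumes the mount point used above.</p>

```python
from pathlib import Path


def list_folder(folder):
    """Return the sorted names inside a folder, or an empty list if it is missing."""
    path = Path(folder)
    if not path.is_dir():
        return []
    return sorted(entry.name for entry in path.iterdir())


# The mount point used above; this prints [] if the drive is not mounted
print(list_folder('/content/gdrive/My Drive'))
```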
<p id="cad4" class="graf graf--p graf-after--pre">If you’d rather download a shared zip file link, you can use:</p>
<pre id="1d29" class="graf graf--pre graf-after--p">!wget 
!unzip</pre>
<p id="3d7c" class="graf graf--p graf-after--pre">For example:</p>
<pre id="fa6d" class="graf graf--pre graf-after--p">!wget -cq <a class="markup--anchor markup--pre-anchor" href="https://s3.amazonaws.com/content.udacity-data.com/courses/nd188/flower_data.zip" target="_blank" rel="nofollow noopener noreferrer">https://s3.amazonaws.com/content.udacity-data.com/courses/nd188/flower_data.zip</a>
!unzip -qq flower_data.zip</pre>
<p id="fa0a" class="graf graf--p graf-after--pre">That will give you Udacity’s flower data set in seconds!</p>
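<p>You can do the same thing from Python if you want the download to live inside a function, for example so it gets skipped when the file is already there. This is only a sketch using the standard library; <code class="markup--code markup--p-code">download_and_unzip</code> is a name I picked, not a Colab feature.</p>

```python
import urllib.request
import zipfile
from pathlib import Path


def download_and_unzip(url, zip_name, dest='.'):
    """Download a zip archive (skipping the download if it exists) and extract it."""
    zip_path = Path(dest) / zip_name
    if not zip_path.exists():
        urllib.request.urlretrieve(url, str(zip_path))
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    return zip_path


# Example call (commented out because it downloads a large file):
# download_and_unzip(
#     'https://s3.amazonaws.com/content.udacity-data.com/courses/nd188/flower_data.zip',
#     'flower_data.zip')
```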
<p id="bef3" class="graf graf--p graf-after--p">If you’re uploading small files, you can upload them directly with some simple code. If you don’t feel like running code to grab a local file, you can also go to the left side of the screen and click “upload files.”</p>
<figure id="ac8f" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/8b416-70f77-11lcilc2o0gyzdup_yl260a.png?w=1080" /></div>
</div>
</figure>
<p id="8c61" class="graf graf--p graf-after--figure">Google Colab is incredibly easy to use on pretty much every level, especially if you’re at all familiar with Jupyter Notebooks. However, grabbing some large files and getting a couple of specific directories to work did trip me up for a minute or two.</p>
<p id="a5eb" class="graf graf--p graf-after--p">I covered <a class="markup--anchor markup--p-anchor" href="https://towardsdatascience.com/setting-up-kaggle-in-google-colab-ebb281b61463" target="_blank" rel="noopener noreferrer">getting started with Kaggle in Google Colab</a> in a separate article, so if that’s what interests you, please check that out!</p>
<h4 id="baaf" class="graf graf--h4 graf-after--p">Importing libraries</h4>
<p id="9162" class="graf graf--p graf-after--h4">Imports are pretty standard, with a few exceptions.</p>
<p id="d655" class="graf graf--p graf-after--p">For the most part, you can import your libraries by running <code class="markup--code markup--p-code">import</code> like you do in any other notebook.</p>
<figure id="2df1" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/3c86a-344c2-1osn4gwhc32224s7nvyr86a.png?w=1080" /></div>
</div>
</figure>
<p id="8885" class="graf graf--p graf-after--figure">PyTorch is different! Before you run any other Torch imports, you’ll want to run the install command below.</p>
<p id="505a" class="graf graf--p graf-after--p"><strong class="markup--strong markup--p-strong">*** UPDATE! (01/29)*** Colab now supports native PyTorch!!! You shouldn’t need to run the code below, but I’m leaving it up just in case anyone is having any issues!</strong></p>
<pre id="0d49" class="graf graf--pre graf-after--p">!pip install -q <a class="markup--anchor markup--pre-anchor" href="http://download.pytorch.org/whl/%7Baccelerator%7D/torch-0.4.1-%7Bplatform%7D-linux_x86_64.whl" target="_blank" rel="nofollow noopener noreferrer">http://download.pytorch.org/whl/{accelerator}/torch-0.4.1-{platform}-linux_x86_64.whl</a> torchvision
import torch</pre>
<p id="13a4" class="graf graf--p graf-after--pre">Then you can continue with your imports. If you try to simply run <code class="markup--code markup--p-code">import torch</code> you’ll get an error message. I really recommend clicking on the extremely helpful links that pop up. If you do, you’ll get that code right away and you can just click on “INSTALL TORCH” to import it into your notebook. The code will pop up on the left-hand side of your screen, and then you can hit “INSERT.”</p>
<figure id="9e16" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/aea5d-79277-19mne-yxkfpqova2h0gyemq.png?w=1080" /></div>
</div>
</figure>
<figure id="435d" class="graf graf--figure graf-after--figure">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/78d46-48732-1zohcgr_9ndahw9cberjkba.png?w=1080" /></div>
</div>
</figure>
<p id="9594" class="graf graf--p graf-after--figure"><span class="markup--quote markup--p-quote is-other">Not able to simply import something else that you want with an import statement? Try a pip install! Just be aware that Google Colab wants an exclamation point before most commands.</span></p>
<pre id="5f8b" class="graf graf--pre graf-after--p">!pip install -q keras
import keras</pre>
<p id="3bb1" class="graf graf--p graf-after--pre">or:</p>
<pre id="80ed" class="graf graf--pre graf-after--p"><code class="markup--code markup--pre-code">!pip3 install torch torchvision</code></pre>
<p id="47cb" class="graf graf--p graf-after--pre">and:</p>
<pre id="ddef" class="graf graf--pre graf-after--p">!apt-get install</pre>
<p id="2a3c" class="graf graf--p graf-after--pre">is useful too!</p>
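<p class="graf graf--p">If you’re not sure whether something needs a pip install in the first place, you can ask Python before you import it. This is just a minimal sketch using the standard library; the helper name <code class="markup--code markup--p-code">is_installed</code> and the second module name are made up for illustration.</p>

```python
import importlib.util

def is_installed(module_name):
    """Return True if Python can find the module, without importing it."""
    return importlib.util.find_spec(module_name) is not None

# "json" ships with Python; the second name is a made-up placeholder
print(is_installed("json"))
print(is_installed("surely_not_a_real_module_name"))
```

<p class="graf graf--p">If the answer is False, that’s your cue to try the pip install.</p>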
<p id="9e3c" class="graf graf--p graf-after--p">I did find that Pillow can be sort of buggy, but you can check which version you have by running</p>
<pre id="936d" class="graf graf--pre graf-after--p">import PIL
print(PIL.__version__)</pre>
<p id="7bbe" class="graf graf--p graf-after--pre">If you get anything below 5.3, go to the “runtime” dropdown menu, restart the runtime, and run the cell again. You should be good to go!</p>
<p id="dc3d" class="graf graf--p graf-after--p">It’s easy to create a new notebook by dropping “File” down to “New Python 3 Notebook.” If you want to open something specific, drop the “File” menu down to “Open Notebook…”</p>
<figure id="87a3" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/143ea-8e6ad-135pw2eysentadqpyxzs0zg.png?w=1080" /></div>
</div>
</figure>
<p id="b448" class="graf graf--p graf-after--figure">Then you’ll see a screen that looks like this:</p>
<figure id="7ca4" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/62ec2-1f2c5-1yxhvdisr6pnkck8vgjkmca.png?w=1080" /></div>
</div>
</figure>
<p id="a8ef" class="graf graf--p graf-after--figure">As you can see, you can open a recent file, files from your Google Drive, GitHub files, and you can upload a notebook right there as well.</p>
<p id="b76a" class="graf graf--p graf-after--p">The GitHub option is great! You can easily search by an organization or user to find files. If you don’t see what you’re looking for, try checking the repository drop-down menu!</p>
<figure id="5cd5" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/412ec-77219-1hekxmtjxecubfccca44fuq.png?w=1080" /></div>
</div>
</figure>
<figure id="932b" class="graf graf--figure graf-after--figure">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/b8d0e-1efb4-11nyifu7rpupli02swpks2w.png?w=1080" /></div>
</div>
</figure>
<figure id="7846" class="graf graf--figure graf-after--figure">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/bcd50-9b66c-17ofskko-iqcsyl2ilglelq.png?w=1080" /></div>
</div>
</figure>
<h4 id="e584" class="graf graf--h4 graf-after--figure">Always be saving</h4>
<p id="d5f7" class="graf graf--p graf-after--h4">Saving your work is simple! You can do a good ol’ “command-s” or drop the “File” menu down to save. You can create a copy of your notebook by dropping “File” -&gt; “Save a Copy in Drive.” You can also download your notebook by going from “File” -&gt; “download .ipynb” or “download .py.”</p>
<figure id="008d" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/8915a-38723-1yi4fxwrtbfkjt-mtxzrqjg.png?w=1080" /></div>
</div>
</figure>
<p id="e783" class="graf graf--p graf-after--figure">That should be enough to at least get you up and running on Colab and taking advantage of that sweet, sweet free GPU! Please let me know if you run into any other newbie problems that I might be able to help you with. I’d love to help you if I can!</p>
<p id="22bf" class="graf graf--p graf-after--p">If you’re just getting started with machine learning and AI, I have a few other articles you might want to check out:</p>
<ul class="postList">
<li style="list-style-type: none;">
<ul class="postList">
<li id="80ab" class="graf graf--li graf-after--p"><a class="markup--anchor markup--li-anchor" href="https://contentsimplicity.com/the-complete-beginners-guide-to-data-cleaning-and-preprocessing/" target="_blank" rel="noopener noreferrer">The complete beginner’s guide to data cleaning and preprocessing</a></li>
<li><a href="https://contentsimplicity.com/how-to-create-a-free-portfolio/" target="_blank" rel="noopener noreferrer">How to Create a Totally Free Portfolio or Website</a></li>
<li class="graf graf--li graf-after--p"><a class="markup--anchor markup--li-anchor" href="https://towardsdatascience.com/wtf-is-image-classification-8e78a8235acb" target="_blank" rel="noopener noreferrer">WTF is Image Classification?</a></li>
<li class="graf graf--li graf-after--p"><a class="markup--anchor markup--li-anchor" href="https://contentsimplicity.com/intro-to-deep-learning/" target="_blank" rel="noopener noreferrer">Intro to Deep Learning</a></li>
<li class="graf graf--li graf-after--p"><a class="markup--anchor markup--li-anchor" href="https://contentsimplicity.com/how-to-build-an-image-classifier-with-greater-than-97-accuracy/" target="_blank" rel="noopener noreferrer">How to build an image classifier with greater than 97% accuracy</a></li>
</ul>
</li>
</ul>
<p id="393e" class="graf graf--p graf-after--li">Have you finished your deep learning or machine learning model, but you don’t know what to do with it next? Why not deploy it to the internet?</p>
<p id="e485" class="graf graf--p graf-after--p"><strong class="markup--strong markup--p-strong">Get your model out there so everyone can see it!</strong></p>
<ul class="postList">
<li id="fbf9" class="graf graf--li graf-after--p"><a class="markup--anchor markup--li-anchor" href="https://heartbeat.fritz.ai/brilliant-beginners-guide-to-model-deployment-133e158f6717" target="_blank" rel="noopener noreferrer">Check out this article to learn how to deploy your machine learning model with Flask</a>!</li>
</ul>
<figure id="bb3f" class="graf graf--figure graf-after--li graf--trailing">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded">
<div style="width: 1610px" class="wp-caption alignnone"><img loading="lazy" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://cdn-images-1.medium.com/max/1600/0*94DRzQRyfBeYHo9a" width="1600" height="1066" /><p class="wp-caption-text"><em>Photo by <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/@scaitlin82?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-creator nofollow noopener noreferrer">Sarah Cervantes</a> on <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-source nofollow noopener noreferrer">Unsplash</a></em></p></div>
</div>
</div>
<figcaption class="imageCaption"></figcaption>
</figure>


<p class="wp-block-paragraph">Thanks for reading!</p>
<span class="et_bloom_bottom_trigger"></span><p>The post <a href="https://contentsimplicity.com/getting-started-with-google-colab/">Getting started with Google Colab</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://contentsimplicity.com/getting-started-with-google-colab/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">42</post-id>	</item>
		<item>
		<title>Data cleaning and preprocessing for beginners</title>
		<link>https://contentsimplicity.com/data-cleaning-and-preprocessing/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=data-cleaning-and-preprocessing&#038;utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=data-cleaning-and-preprocessing</link>
					<comments>https://contentsimplicity.com/data-cleaning-and-preprocessing/#respond</comments>
		
		<dc:creator><![CDATA[Anne B]]></dc:creator>
		<pubDate>Wed, 03 Apr 2019 09:25:24 +0000</pubDate>
				<category><![CDATA[data]]></category>
		<category><![CDATA[tutorial]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[artificial intelligence]]></category>
		<category><![CDATA[beginner]]></category>
		<category><![CDATA[Coding]]></category>
		<category><![CDATA[data cleaning]]></category>
		<category><![CDATA[data preprocessing]]></category>
		<category><![CDATA[featured]]></category>
		<category><![CDATA[machine learning]]></category>
		<category><![CDATA[Programming]]></category>
		<guid isPermaLink="false">https://learnbasictech.wordpress.com/?p=140</guid>

					<description><![CDATA[<p>If your data hasn’t been cleaned and preprocessed, your model does not work. Data preprocessing is the first step toward building a machine learning model.</p>
<p>The post <a href="https://contentsimplicity.com/data-cleaning-and-preprocessing/">Data cleaning and preprocessing for beginners</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p class="wp-block-paragraph"><em>Picture via Pixabay <a href="http://pixabay.com/">http://pixabay.com</a></em></p>



<h4 class="wp-block-heading" id="c2d4">How to successfully prepare your data for a machine learning model in minutes</h4>


<div class="postArticle-content js-postField js-notesSource js-trackPostScrolls">
<section class="section section--body section--first section--last">
<div class="section-content">
<div class="section-inner sectionLayout--insetColumn">
<p><em>(<a href="https://towardsdatascience.com/the-complete-beginners-guide-to-data-cleaning-and-preprocessing-2070b7d4c6d" target="_blank" rel="noopener noreferrer">This article first appeared in Towards Data Science</a>)</em></p>
<p id="9696" class="graf graf--p graf-after--figure">Data cleaning and preprocessing is the first (and arguably most important) step toward building a working machine learning model. It’s critical!</p>
<p id="19fd" class="graf graf--p graf-after--p"><span class="markup--quote markup--p-quote is-other"><strong class="markup--strong markup--p-strong">If your data hasn’t been cleaned and preprocessed, your model does not work.</strong></span></p>
<p id="9cf2" class="graf graf--p graf-after--p">It’s that simple.</p>
<p id="9b54" class="graf graf--p graf-after--p">Data cleaning is generally thought of as the boring part. But it’s the difference between being prepared and being completely unprepared. It’s the difference between looking like a pro and looking pretty foolish.</p>
<p id="ba5f" class="graf graf--p graf-after--p">It’s kind of like getting ready for a vacation. You might not like the preparation part, but tightening down the details in advance can save you from one nightmare of a trip.</p>
<p id="37de" class="graf graf--p graf-after--p">You just have to do it or you can’t start having fun.</p>
<p id="607a" class="graf graf--p graf-after--figure">But how do you do it?</p>
<blockquote id="0975" class="graf graf--pullquote graf-after--p">
<p><strong class="markup--strong markup--pullquote-strong">This tutorial walks you through the basics of preparing any dataset for any machine learning model.</strong></p>
</blockquote>
<h3 id="921a" class="graf graf--h4 graf-after--pullquote">Imports first!</h3>
<p id="bd68" class="graf graf--p graf-after--h4">We want to start the data cleaning process by importing the libraries that you’ll need to preprocess your data. A library is really just a tool that you can use. You give the library the input, the library does its job, and it gives you the output you need. There are tons of libraries available, but three of them come up pretty much every time you work with data in Python: NumPy, Matplotlib, and Pandas. <strong class="markup--strong markup--p-strong">NumPy</strong> is the library you’ll need for all things mathematical. Since your code is going to run on math, you’re going to use this one. <strong class="markup--strong markup--p-strong">Matplotlib</strong> (specifically matplotlib.pyplot) is the library you’ll want if you’re going to make charts. <strong class="markup--strong markup--p-strong">Pandas</strong> is the best tool available for importing and managing datasets. Pandas and NumPy are basically essential for data preprocessing.</p>
<p id="210b" class="graf graf--p graf-after--p">It makes the most sense to import these libraries with a shortcut alias so that you can save a little time later. That’s simple and you can do it like this:</p>
<pre id="9ac8" class="graf graf--pre graf-after--p">import numpy as np
import matplotlib.pyplot as plt
import pandas as pd</pre>
<p id="ff19" class="graf graf--p graf-after--pre">Now you can read in your dataset by typing</p>
<pre id="b14e" class="graf graf--pre graf-after--p">dataset = pd.read_csv('my_data.csv')</pre>
<p id="7166" class="graf graf--p graf-after--pre">This tells Pandas (pd) to read in your dataset. These are the first few lines of the dataset I put together for this tutorial:</p>
<figure id="3599" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/b8820-4b8f2-1p6vrcgpm1mi6ryu7wzpehg.png?w=1080" /></div>
</div>
</figure>
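<p class="graf graf--p">If you don’t have a CSV handy and want to follow along, here’s a rough stand-in. The rows and the column names (animal, age, worth, friendly) are made up for this sketch, so treat them as assumptions and not as the actual tutorial file.</p>

```python
import io
import pandas as pd

# Made-up rows standing in for my_data.csv; the column names are assumptions
csv_text = """animal,age,worth,friendly
cat,4,48000,yes
dog,,61000,no
moose,17,,yes
dog,9,83000,no
"""

dataset = pd.read_csv(io.StringIO(csv_text))
print(dataset.head())        # the first few rows
print(dataset.isna().sum())  # how many values are missing in each column
```

<p class="graf graf--p">Empty cells in the CSV come back as NaN, which is what the missing data step later on looks for.</p>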
<p id="adc9" class="graf graf--p graf-after--figure"><span class="markup--quote markup--p-quote is-other">Now we have our dataset, but we need to create a matrix of independent variables (the features) and a vector of the dependent variable (the thing we want to predict). You can create the matrix of independent variables by typing:</span></p>
<pre id="49ab" class="graf graf--pre graf-after--p">X = dataset.iloc[:, :-1].values</pre>
<p id="b6a1" class="graf graf--p graf-after--pre">That first colon (<strong class="markup--strong markup--p-strong">:</strong>) means that we want to grab all of the lines in our dataset. <strong class="markup--strong markup--p-strong">:-1</strong> means that we want to grab all of the columns of data except the last column. The .<strong class="markup--strong markup--p-strong">values</strong> on the end means that we want to grab all of the values.</p>
<p id="65da" class="graf graf--p graf-after--p">Now we want a vector of the dependent variable with only the data from the last column, so we can type</p>
<pre id="64f8" class="graf graf--pre graf-after--p">y = dataset.iloc[:, 3].values</pre>
<p id="327a" class="graf graf--p graf-after--pre">Remember when you’re looking at your dataset, the index starts at 0. If you’re trying to count the columns, start counting at 0, not 1. 0 is the <strong class="markup--strong markup--p-strong">animal</strong> column, 1 is the <strong class="markup--strong markup--p-strong">age</strong> column, and 2 is the <strong class="markup--strong markup--p-strong">worth </strong>column, so <strong class="markup--strong markup--p-strong">[:, 3]</strong> gets you the fourth and last column, the one we want to predict. You will get used to this counting system if you aren’t already!</p>
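<p class="graf graf--p">Here’s a small sketch of what those two slices give you. The dataset is made up (three feature columns plus a hypothetical “friendly” label column), so the names are assumptions, but the slicing is the same as above.</p>

```python
import pandas as pd

# Hypothetical four-column dataset: three feature columns and one label column
dataset = pd.DataFrame({
    "animal": ["cat", "dog", "moose"],
    "age": [4, 9, 17],
    "worth": [48000, 61000, 83000],
    "friendly": ["yes", "no", "yes"],
})

X = dataset.iloc[:, :-1].values  # every row, every column except the last
y = dataset.iloc[:, 3].values    # every row, only the column at index 3

print(X.shape)  # (3, 3)
print(y.shape)  # (3,)
print(y)
```

<p class="graf graf--p">X is a table with one row per animal, and y is a flat list with one label per animal.</p>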
<h3 id="3c34" class="graf graf--h4 graf-after--p">What happens if we have missing data?</h3>
<p id="05ac" class="graf graf--p graf-after--h4">This actually happens all the time.</p>
<p id="57b8" class="graf graf--p graf-after--figure">We could just remove the lines where data are missing, but that’s really not the smartest idea. That could easily cause problems. We need to find a better idea! The most common solution is to take the mean of the column to fill in the missing data point.</p>
<p id="c5e2" class="graf graf--p graf-after--p">You can easily do this with scikit-learn’s imputer class. (If you don’t know about it already, <a class="markup--anchor markup--p-anchor" href="https://scikit-learn.org/" target="_blank" rel="noopener noreferrer">scikit-learn</a> contains amazing machine learning models and I strongly suggest you check it out!)</p>
<p id="b5eb" class="graf graf--p graf-after--p">You might not be comfortable with terms like “<strong class="markup--strong markup--p-strong">method</strong>,” “<strong class="markup--strong markup--p-strong">class</strong>,” and “<strong class="markup--strong markup--p-strong">object</strong>” as they apply to machine learning. Not a problem!</p>
<p id="31cb" class="graf graf--p graf-after--p">A <strong class="markup--strong markup--p-strong">class</strong> is the model of something that we want to build. If we’re going to build a shed, the construction plan for the shed is the class.</p>
<p id="6df6" class="graf graf--p graf-after--p">An <strong class="markup--strong markup--p-strong">object</strong> is an instance of the class. The object in this example is the shed we built by following the construction plan. There can be many objects of the same class. That’s like saying that you can make lots of sheds from the construction plan.</p>
<p id="094f" class="graf graf--p graf-after--p">A <strong class="markup--strong markup--p-strong">method</strong> is a tool that we can use on the object, or a function that’s applied to the object that takes some inputs and returns some output. This is like a handle that we can use to open the window when our shed is starting to get a little stuffy.</p>
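<p class="graf graf--p">If it helps to see the shed in code, here’s a tiny sketch. The class name, the color argument, and the method are all invented for the analogy.</p>

```python
class Shed:
    """The class is the construction plan."""

    def __init__(self, color):
        self.color = color
        self.window_open = False

    def open_window(self):
        """A method: something we can do with one particular shed."""
        self.window_open = True
        return f"The window of the {self.color} shed is open."

# Two objects (sheds) built from the same plan
red_shed = Shed("red")
blue_shed = Shed("blue")

print(red_shed.open_window())
print(blue_shed.window_open)  # still False: each object keeps its own state
```

<p class="graf graf--p">One plan, two sheds, and opening the window on one of them doesn’t touch the other.</p>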
<figure id="2695" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded">
<div style="width: 1610px" class="wp-caption alignnone"><img loading="lazy" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://cdn-images-1.medium.com/max/1600/0*9tNbMFo4CSApIumF" width="1600" height="1080" /><p class="wp-caption-text"><em>Photo by <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/@romankraft?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-creator noopener noreferrer">Roman Kraft</a> on <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-source noopener noreferrer">Unsplash</a></em></p></div>
</div>
</div>
<figcaption class="imageCaption"></figcaption>
</figure>
<p id="ceea" class="graf graf--p graf-after--figure">To use the imputer, we would run something like this</p>
<pre id="bfa4" class="graf graf--pre graf-after--p"># In current scikit-learn the imputer lives in sklearn.impute as SimpleImputer
from sklearn.impute import SimpleImputer
imputer = SimpleImputer(missing_values = np.nan, strategy = 'mean')</pre>
<p id="5572" class="graf graf--p graf-after--pre">Mean is the default strategy, so you don’t actually need to specify that, but it’s here so you can get a sense of what information you want to include. The default value for missing_values is np.nan. If the missing values in your dataset show up as “NaN,” you’ll stick with np.nan. <a class="markup--anchor markup--p-anchor" href="https://scikit-learn.org/stable/modules/generated/sklearn.impute.SimpleImputer.html" target="_blank" rel="noopener noreferrer">Check out the official documentation here</a>!</p>
<p id="7b39" class="graf graf--p graf-after--p">Now to fit this imputer, we type</p>
<pre id="decf" class="graf graf--pre graf-after--p">imputer = imputer.fit(X[:, 1:3])</pre>
<p id="c88f" class="graf graf--p graf-after--pre">We only want to fit the imputer to the columns where data are missing. The first colon means that we want to include all of the lines, while <strong class="markup--strong markup--p-strong">1:3</strong> means that we’re taking column indexes 1 and 2. Don’t worry. You’ll get used to the way Python counts in no time!</p>
<p id="1eb9" class="graf graf--p graf-after--p">Now we want to use the method that will actually replace the missing data. You’ll set that up by typing</p>
<pre id="61b9" class="graf graf--pre graf-after--p">X[:, 1:3] = imputer.transform(X[:, 1:3])</pre>
<figure id="ab95" class="graf graf--figure graf-after--pre">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/92171-f86fc-1xwa1n4kvuobc_tfxkw31bg.png?w=1080" /></div>
</div>
</figure>
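<p class="graf graf--p">To see the fit and transform steps end to end, here’s a minimal sketch on made-up numbers. It assumes a current scikit-learn release (where the class is SimpleImputer) and keeps only the two numeric columns, age and worth.</p>

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Made-up numeric columns: age and worth (np.nan marks a missing value)
ages_and_worth = np.array([
    [4, 48000],
    [np.nan, 61000],
    [17, np.nan],
    [9, 83000],
], dtype=float)

imputer = SimpleImputer(missing_values=np.nan, strategy="mean")
imputer = imputer.fit(ages_and_worth)       # learn the mean of each column
filled = imputer.transform(ages_and_worth)  # swap each NaN for that mean

print(filled)
```

<p class="graf graf--p">The missing age becomes 10 (the mean of 4, 17, and 9) and the missing worth becomes 64,000 (the mean of the other three values).</p>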
<p id="33a4" class="graf graf--p graf-after--figure">Try this out with other strategies! You might find that it makes more sense for your project to fill in the missing values with the median of the column. Or the mode! Decisions like these seem small, but they actually hold a lot of importance.</p>
<p id="bef3" class="graf graf--p graf-after--p">Just because something is popular doesn’t necessarily make it the right choice. The average (mean) of your data points isn’t necessarily the best choice for your model.</p>
<p id="73e7" class="graf graf--p graf-after--p">After all, nearly everyone reading this article has an above average number of arms…</p>
<figure id="ca57" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded">
<div style="width: 1610px" class="wp-caption alignnone"><img loading="lazy" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://cdn-images-1.medium.com/max/1600/0*5lpdsnzBBNF7Y2KA" width="1600" height="1066" /><p class="wp-caption-text"><em>Photo by <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/@matthewhenry?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-creator noopener noreferrer">Matthew Henry</a> on <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-source noopener noreferrer">Unsplash</a></em></p></div>
</div>
</div>
<figcaption class="imageCaption"></figcaption>
</figure>
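<p class="graf graf--p">The arms joke makes a nice test case. Here’s a rough sketch that fills the same missing value three different ways. The numbers are made up, and it assumes SimpleImputer from a current scikit-learn release.</p>

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Made-up column: number of arms for six people, with one value missing
arms = np.array([[2], [2], [2], [1], [2], [np.nan]], dtype=float)

fills = {}
for strategy in ("mean", "median", "most_frequent"):
    filled = SimpleImputer(strategy=strategy).fit_transform(arms)
    fills[strategy] = float(filled[-1, 0])  # the value that replaced the NaN

print(fills)
```

<p class="graf graf--p">The mean hands the last person 1.8 arms, while the median and the mode both say 2. Which one is “right” depends on your data.</p>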
<h3 id="7c26" class="graf graf--h4 graf-after--figure">What if you have categorical data?</h3>
<p id="e9d4" class="graf graf--p graf-after--h4">Great question! You can’t exactly take the mean of <strong class="markup--strong markup--p-strong">cat</strong>, <strong class="markup--strong markup--p-strong">dog</strong>, and <strong class="markup--strong markup--p-strong">moose</strong>. What can we do? We can encode the categorical values as numbers! You’ll want to grab the Label Encoder class from sklearn.preprocessing.</p>
<p id="0df5" class="graf graf--p graf-after--p">Start with one column where you want to encode the data and call the label encoder. Then fit it onto your data:</p>
<pre id="9a73" class="graf graf--pre graf-after--p">from sklearn.preprocessing import LabelEncoder
labelencoder_X = LabelEncoder()
X[:, 0] = labelencoder_X.fit_transform(X[:, 0])</pre>
<p id="aa7f" class="graf graf--p graf-after--pre">(Remember how the numbers in the brackets work? : means that we want to work with all of the lines and 0 means that we want to grab the first column.)</p>
<p id="d8cf" class="graf graf--p graf-after--p">That’s all it takes to replace the categorical variables in your first column with numbers. The label encoder numbers the categories in sorted order, so if those three animals are the only ones in your column, instead of “cat” you’ll have “0,” instead of “dog” you’ll have “1,” and instead of “moose” you’ll have “2.”</p>
<p id="90c5" class="graf graf--p graf-after--p"><strong class="markup--strong markup--p-strong">Do you see the potential problem?</strong></p>
<p id="9e26" class="graf graf--p graf-after--p">That system of labeling implies a hierarchical value to the data that could affect your model. 2 has a higher value than 0, but <strong class="markup--strong markup--p-strong">moose</strong> is not (necessarily…) greater than <strong class="markup--strong markup--p-strong">cat</strong>.</p>
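<p class="graf graf--p">You can check exactly which number each category received by looking at the encoder’s classes_ attribute. A minimal sketch on a made-up animal column:</p>

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

animals = np.array(["moose", "cat", "dog", "cat"])  # made-up column

labelencoder = LabelEncoder()
encoded = labelencoder.fit_transform(animals)

print(labelencoder.classes_)  # the categories, in the order they were numbered
print(encoded)                # the same column as numbers
```

<p class="graf graf--p">The position of each category in classes_ is its number, so the numbers come from sorting the names and not from anything meaningful about the animals.</p>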
<figure id="1667" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded">
<div style="width: 1610px" class="wp-caption alignnone"><img loading="lazy" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://cdn-images-1.medium.com/max/1600/0*RXDLfUCOCtWG82Qs" width="1600" height="1066" /><p class="wp-caption-text"><em>Photo by <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/@cellisboa?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-creator noopener noreferrer">Cel Lisboa</a> on <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-source noopener noreferrer">Unsplash</a></em></p></div>
</div>
</div>
<figcaption class="imageCaption"></figcaption>
</figure>
<p id="59f7" class="graf graf--p graf-after--figure">We need to create <strong class="markup--strong markup--p-strong">dummy variables</strong>! Dummy variables are an awesome option for data cleaning and preprocessing.</p>
<p id="a7a7" class="graf graf--p graf-after--p">We can create one column for cat, one for moose, and so on. Then we’ll fill the columns in with 1s and 0s (think 1=yes and 0=no.) That means that if you had <strong class="markup--strong markup--p-strong">cat</strong> in your original column, now you’d have a 0 in the moose column, a 0 in the dog column, and a 1 in the cat column.</p>
<p id="44d5" class="graf graf--p graf-after--p">That sounds complicated. Enter One Hot Encoder!</p>
<p id="ce82" class="graf graf--p graf-after--p">Import the encoder and then specify the index of the column</p>
<pre id="1222" class="graf graf--pre graf-after--p">from sklearn.preprocessing import OneHotEncoder
onehotencoder = OneHotEncoder(categorical_features = [0])</pre>
<p id="1879" class="graf graf--p graf-after--pre">Now a little fit and transform</p>
<pre id="19ca" class="graf graf--pre graf-after--p">X = onehotencoder.fit_transform(X).toarray()</pre>
<p id="4473" class="graf graf--p graf-after--pre">Voila! Your single column has been replaced by one column for each of the categorical variables that you had in your original column and it has 1s and 0s replacing the categorical variables.</p>
<p id="8072" class="graf graf--p graf-after--p">Pretty sweet, right?</p>
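<p class="graf graf--p">One heads-up: newer scikit-learn releases no longer accept the categorical_features argument. If the code above gives you an error, here’s a sketch of one way to get the same result by wrapping the encoder in a ColumnTransformer. The data is made up, and the step name “animal” is just a label I picked.</p>

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Made-up feature matrix: animal, age, worth
X = np.array([
    ["cat", 4, 48000],
    ["dog", 9, 61000],
    ["moose", 17, 83000],
], dtype=object)

# One-hot encode column 0, keep the other columns as they are,
# and ask for a regular (dense) array back with sparse_threshold=0
transformer = ColumnTransformer(
    [("animal", OneHotEncoder(), [0])],
    remainder="passthrough",
    sparse_threshold=0,
)
X_encoded = transformer.fit_transform(X)

print(X_encoded)
```

<p class="graf graf--p">The animal column turns into three columns of 1s and 0s (cat, dog, moose), followed by the untouched age and worth columns.</p>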
<p id="28c6" class="graf graf--p graf-after--p">We can go ahead and use label encoder for our <strong class="markup--strong markup--p-strong">y</strong> column if we have categorical variables like “yes” and “no.”</p>
<pre id="c7d3" class="graf graf--pre graf-after--p">labelencoder_y = LabelEncoder()
y = labelencoder_y.fit_transform(y)</pre>
<p id="4248" class="graf graf--p graf-after--pre">This will go ahead and fit and transform y into an encoded variable with 1 for yes and 0 for no.</p>
<figure id="8a3d" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/9bd43-f0298-1tpsvsf1gripfakrhlsnr2a.png?w=1080" /></div>
</div>
</figure>
<h3 id="631c" class="graf graf--h4 graf-after--figure">Train test split</h3>
<p id="0e3b" class="graf graf--p graf-after--h4">At this point, you can go ahead and split your data into training and testing sets. <a class="markup--anchor markup--p-anchor" href="https://towardsdatascience.com/wtf-is-image-classification-8e78a8235acb" target="_blank" rel="noopener noreferrer">I know I already said this in the image classification tutorial</a>, but always separate your data into training and testing sets and <strong class="markup--strong markup--p-strong">never</strong> use your testing data for training! You need to avoid overfitting. (You can think of overfitting like memorizing super specific details before a test without understanding the information. When you memorize details, you’ll do a great job with your flashcards at home. You’ll fail any real test, though, where you’re presented with new information.)</p>
<p id="6e1d" class="graf graf--p graf-after--p">Right now, we have a machine that needs to learn something. It needs to train on data and see how well it understands what it’s learned on separate data. Memorizing the training set is not the same thing as learning! The better your model learns the real patterns in the training set (and not just its quirks), the better it will be at predicting the results for the testing set. You never want to overfit your model. You really want it to learn!</p>
<figure id="709e" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded">
<div style="width: 1610px" class="wp-caption alignnone"><img loading="lazy" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://cdn-images-1.medium.com/max/1600/0*Ic-G-LdQHkXym7vQ" width="1600" height="1934" /><p class="wp-caption-text"><em>Photo by <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/@thepootphotographer?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-creator noopener noreferrer">Janko Ferlič</a> on <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-source noopener noreferrer">Unsplash</a></em></p></div>
</div>
</div>
<figcaption class="imageCaption"></figcaption>
</figure>
<p id="b3bd" class="graf graf--p graf-after--figure">First, we import</p>
<pre id="dffb" class="graf graf--pre graf-after--p">from sklearn.model_selection import train_test_split</pre>
<p id="2765" class="graf graf--p graf-after--pre">Now we can create X_train and X_test and y_train and y_test sets.</p>
<pre id="8aec" class="graf graf--pre graf-after--p">X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)</pre>
<p id="ee8d" class="graf graf--p graf-after--pre">It’s very common to do an 80/20 split of your data, with 80% of your data going to training and 20% to testing. That’s why we specified a test_size of 0.2. You can split it however you need to. <span class="markup--quote markup--p-quote is-other">You don’t need to set a random state, but I like to do that so that we can exactly reproduce our results.</span></p>
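<p class="graf graf--p graf-after--p">If you’re curious what a split like this is doing, here’s a minimal sketch of the idea in plain Python. It is not scikit-learn’s implementation: the helper name <code>simple_train_test_split</code> and the rounding rule are assumptions made up for illustration.</p>

```python
import random


def simple_train_test_split(rows, test_size=0.2, seed=0):
    """Shuffle the rows reproducibly, then hold out a fraction for testing."""
    shuffled = list(rows)
    # A fixed seed gives the same shuffle, and so the same split, on every run
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_size)
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(10))  # stand-in for 10 observations
train, test = simple_train_test_split(rows)
print(len(train), len(test))  # 8 training rows, 2 testing rows
```

<p class="graf graf--p graf-after--p">Calling the helper again with the same seed returns exactly the same split, which is the point of setting a random state.</p>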
<h3 id="044e" class="graf graf--h4 graf-after--p">Now for <span class="markup--quote markup--h4-quote is-other">feature scaling</span>.</h3>
<p id="f1a3" class="graf graf--p graf-after--h4">What is feature scaling? Why do we need it?</p>
<p id="6b7d" class="graf graf--p graf-after--p">Well, look at our data. We have one column with animal ages from 4–17 and we have animal worth that ranges from $48,000-$83,000. Not only is the worth column made up of much higher numbers than the age column, but it also covers a much wider range. That means that the Euclidean distance between any two observations will be dominated by <strong class="markup--strong markup--p-strong">worth</strong>, which will wind up drowning out the <strong class="markup--strong markup--p-strong">age</strong> data.</p>
<p id="846d" class="graf graf--p graf-after--p">What if Euclidean distance doesn’t play a part in your specific machine learning model? <span class="markup--quote markup--p-quote is-other">Scaling the features can still help many models train faster (models trained with gradient descent, for example, tend to converge more quickly), so you might want to include this step when you’re preprocessing your data</span>.</p>
<p id="98d4" class="graf graf--p graf-after--p">There are many ways to do feature scaling. They all mean that we’re putting all of our features into the same scale so that none are dominated by another.</p>
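<p class="graf graf--p graf-after--p">To see the problem in numbers, here’s a small sketch in plain Python. The four (age, worth) rows are made up to sit in the same ranges as our data, and the <code>standardize</code> helper is a hypothetical stand-in for a scaler (it rescales each column to mean 0 and standard deviation 1).</p>

```python
import math
from statistics import mean, pstdev

# Hypothetical (age, worth) rows in the same ranges as the article's data
animals = [(4, 48000), (17, 52000), (9, 83000), (12, 61000)]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def standardize(rows):
    """Rescale each column to mean 0 and standard deviation 1."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    return [tuple((v - m) / s for v, m, s in zip(row, means, stds)) for row in rows]


raw_distance = euclidean(animals[0], animals[1])
scaled = standardize(animals)
scaled_distance = euclidean(scaled[0], scaled[1])

print(raw_distance)     # about 4000: almost entirely the gap in worth
print(scaled_distance)  # after scaling, the age gap counts too
```

<p class="graf graf--p graf-after--p">Before scaling, a 13-year age gap barely moves the distance. After scaling, both columns contribute on a comparable scale.</p>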
<p id="165b" class="graf graf--p graf-after--p">Start with the import (you must be getting used to that)</p>
<pre id="a3d1" class="graf graf--pre graf-after--p">from sklearn.preprocessing import StandardScaler</pre>
<p id="d96e" class="graf graf--p graf-after--pre">Then create a scaler object by calling StandardScaler. We’ll use it to scale our features.</p>
<pre id="4239" class="graf graf--pre graf-after--p">sc_X = StandardScaler()</pre>
<p id="e94c" class="graf graf--p graf-after--pre">Now we directly fit and transform our dataset. Grab the object and apply the methods.</p>
<pre id="8a52" class="graf graf--pre graf-after--p">X_train = sc_X.fit_transform(X_train)
X_test = sc_X.transform(X_test)</pre>
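<p id="f588" class="graf graf--p graf-after--pre">We don’t need to fit it to our test set. We just need a transform, so that the test data is scaled with the values learned from the training set. If your y needs scaling too (more on that below), it works the same way:</p>
<pre id="2c5b" class="graf graf--pre graf-after--p">sc_y = StandardScaler()
# StandardScaler expects a 2D array, so reshape y if it is a 1D NumPy array
y_train = sc_y.fit_transform(y_train.reshape(-1, 1))</pre>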
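<p class="graf graf--p graf-after--pre">Why fit on the training set only? Here’s a rough sketch of the fit-then-transform idea in plain Python, assuming standardization with the mean and standard deviation. The age values are made up, and this is an illustration rather than scikit-learn’s own code.</p>

```python
from statistics import mean, pstdev

train_ages = [4, 17, 9, 12]  # hypothetical training values
test_ages = [6, 15]          # hypothetical unseen values

# "fit": learn the statistics from the training data only
mu = mean(train_ages)
sigma = pstdev(train_ages)

# "transform": apply those same statistics to both sets
train_scaled = [(age - mu) / sigma for age in train_ages]
test_scaled = [(age - mu) / sigma for age in test_ages]

print(train_scaled)
print(test_scaled)
```

<p class="graf graf--p graf-after--pre">The test values never influence <code>mu</code> or <code>sigma</code>, so no information from the testing set leaks into training.</p>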
<h4 id="68ae" class="graf graf--h4 graf-after--pre"><strong class="markup--strong markup--h4-strong">What about the dummy variables? Do you need to scale them?</strong></h4>
<p id="ef0e" class="graf graf--p graf-after--h4">Well, some people say yes and some say no. It’s a question of how much you want to hang on to your interpretation. It is good to have all of our data at the same scale. But if we scale our data, we lose our ability to easily interpret which observations belong to which variable.</p>
<p id="91e6" class="graf graf--p graf-after--p"><span class="markup--quote markup--p-quote is-other">What about <strong class="markup--strong markup--p-strong">y</strong>? If you have a dependent variable like 0 and 1, you really don’t need to apply feature scaling. It’s a classification problem with a categorical dependent variable. But if your dependent variable covers a large range of values (as in a regression problem), then yes! You do want to apply the scaler!</span></p>
<h4 id="57ae" class="graf graf--h4 graf-after--p">You did it!</h4>
<p id="7afc" class="graf graf--p graf-after--h4">That’s it!</p>
<figure id="2040" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded">
<div style="width: 1610px" class="wp-caption alignnone"><img loading="lazy" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://cdn-images-1.medium.com/max/1600/0*UyfAS8-vfoe7EcoI" width="1600" height="1066" /><p class="wp-caption-text"><em>Photo by <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/@rovenimages_com?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-creator noopener noreferrer">Roven Images</a> on <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-source noopener noreferrer">Unsplash</a></em></p></div>
</div>
</div>
<figcaption class="imageCaption"></figcaption>
</figure>
<p id="e1d6" class="graf graf--p graf-after--figure">With just a handful of lines of code, you’ve taken care of the basics of data cleaning and preprocessing! <a class="markup--anchor markup--p-anchor" href="https://github.com/bonn0062/basic_data_preprocessing" target="_blank" rel="noopener noreferrer">You can see the code here</a> if you want to take a look.</p>
<p id="3eae" class="graf graf--p graf-after--p">There will definitely be a ton of thought that you’ll need to put into this step. You want to think about exactly how you’re going to fill in your missing data. Consider whether you need to scale your features and how you want to do it. Dummy variables or no? Are you going to encode your data? Will you encode your dummy variables? There are a ton of details to consider here. Nobody said data cleaning would be easy!</p>
<p id="b92f" class="graf graf--p graf-after--p">That said, you’ve got this!</p>
<p id="e6b3" class="graf graf--p graf-after--p">Now get out there and get that data ready!</p>
<p id="2df7" class="graf graf--p graf-after--p">Are you curious about deep learning? You might want to take a look at <a class="markup--anchor markup--p-anchor" href="https://contentsimplicity.com/intro-to-deep-learning/" target="_blank" rel="noopener noreferrer">Intro to Deep Learning</a>!</p>
<p id="2611" class="graf graf--p graf-after--p">Need some free GPU, but not sure where to find it? Check out <a class="markup--anchor markup--p-anchor" href="https://contentsimplicity.com/getting-started-with-google-colab/" target="_blank" rel="noopener noreferrer">Getting Started with Google Colab</a>.</p>
<p id="9f33" class="graf graf--p graf-after--p">Have you already finished a machine learning model, but you don’t know what to do with it next? Why not deploy it to the internet?</p>
<p id="5a61" class="graf graf--p graf-after--p"><a class="markup--anchor markup--p-anchor" href="https://heartbeat.fritz.ai/brilliant-beginners-guide-to-model-deployment-133e158f6717" target="_blank" rel="noopener noreferrer">Check out this article to learn how to deploy your machine learning model with Flask</a>!</p>
<p id="ba76" class="graf graf--p graf-after--p graf--trailing">As always, if you’re doing anything cool with this information, let people know about it in the responses below or reach out any time on Twitter <a class="markup--anchor markup--p-anchor" href="https://twitter.com/annebonnerdata" target="_blank" rel="noopener noreferrer">@annebonnerdata</a>!</p>
</div>
</div>
</section>
</div>
<footer class="u-paddingTop10">
<div class="container u-maxWidth740">
<div class="row">
<div class="col u-size12of12"> </div>
</div>
</div>
</footer><span class="et_bloom_bottom_trigger"></span><p>The post <a href="https://contentsimplicity.com/data-cleaning-and-preprocessing/">Data cleaning and preprocessing for beginners</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://contentsimplicity.com/data-cleaning-and-preprocessing/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">206</post-id>	</item>
		<item>
		<title>Simply deep learning: an effortless introduction</title>
		<link>https://contentsimplicity.com/simply-deep-learning-an-effortless-introduction/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=simply-deep-learning-an-effortless-introduction&#038;utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=simply-deep-learning-an-effortless-introduction</link>
					<comments>https://contentsimplicity.com/simply-deep-learning-an-effortless-introduction/#respond</comments>
		
		<dc:creator><![CDATA[Anne B]]></dc:creator>
		<pubDate>Mon, 04 Mar 2019 04:13:44 +0000</pubDate>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[artificial intelligence]]></category>
		<category><![CDATA[Coding]]></category>
		<category><![CDATA[deep learning]]></category>
		<category><![CDATA[machine learning]]></category>
		<category><![CDATA[Programming]]></category>
		<guid isPermaLink="false">https://learnbasictech.wordpress.com/?p=100</guid>

					<description><![CDATA[<p>This article first appeared on Towards Data Science. What is an artificial neural network, how does it work, and what does it have to do with deep learning? Let’s start with a quick recap from part 1 for anyone who hasn’t looked at it: What is deep learning? It’s learning from examples. That’s pretty much the deal. [&#8230;]</p>
<p>The post <a href="https://contentsimplicity.com/simply-deep-learning-an-effortless-introduction/">Simply deep learning: an effortless introduction</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p id="6717" class="graf graf--p graf-after--h4"><em class="markup--em markup--p-em">This article</em><em class="markup--em markup--p-em"> first appeared on <a href="https://towardsdatascience.com/simply-deep-learning-an-effortless-introduction-45591a1c4abb" target="_blank" rel="noopener noreferrer">Towards Data Science.</a></em></p>
<h4 id="0a0f" class="graf graf--h4 graf-after--figure">What is an artificial neural network, how does it work, and what does it have to do with deep learning?</h4>
<p id="c7a1" class="graf graf--p graf-after--h4">Let’s start with a quick recap from part 1 for anyone who hasn’t looked at it:</p>
<h4 id="5d2e" class="graf graf--h4 graf-after--p">What is deep learning?</h4>
<p id="496f" class="graf graf--p graf-after--h4">It’s <strong class="markup--strong markup--p-strong">learning from examples</strong>. That’s pretty much the deal.</p>
<p id="134c" class="graf graf--p graf-after--p">At a very basic level, deep learning is a machine learning technique. It teaches a computer to filter inputs through layers to learn how to predict and classify information. Observations can be in the form of images, text, or sound.</p>
<p id="c00c" class="graf graf--p graf-after--p">The inspiration for deep learning is the way that the human brain filters information. Its purpose is to mimic how the human brain works to create some real magic.</p>
<p id="bd57" class="graf graf--p graf-after--p"><strong class="markup--strong markup--p-strong">Deep learning attempts to mimic the activity in layers of neurons in the neocortex.</strong></p>
<p id="b15f" class="graf graf--p graf-after--p"><em class="markup--em markup--p-em">It’s very literally an artificial neural network</em>.</p>
<p id="c76a" class="graf graf--p graf-after--p">In the human brain, there are, by common estimates, somewhere around 86 to 100 billion neurons. Each neuron can connect to thousands of its neighbors. That is what we’re trying to create, but in a way and at a level that works for machines.</p>
<figure id="d894" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"></div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded">
<p><div style="width: 650px" class="wp-caption alignnone"><img data-recalc-dims="1" loading="lazy" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/568bb-d327c-1g6zlsges6n04fwsu5gnncw.jpeg?resize=640%2C452" width="640" height="452" /><p class="wp-caption-text"><em>Image by geralt on <a class="markup--anchor markup--figure-anchor" href="https://pixabay.com/" target="_blank" rel="noopener noreferrer">Pixabay</a></em></p></div></p>
</div>
</div><figcaption class="imageCaption"></figcaption></figure>
<p id="3eae" class="graf graf--p graf-after--figure">What does this mean in terms of neurons, axons, dendrites, and so on? Well, the neuron has a body, dendrites, and an axon. The signal from one neuron travels down the axon and transfers to the dendrites of the next neuron. That connection where the signal passes is called a synapse.</p>
<figure id="eff3" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"></div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded">
<p><div style="width: 492px" class="wp-caption alignnone"><img data-recalc-dims="1" loading="lazy" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/4fb07-fc247-10oxciq4kwvxpxezlwzhotq.png?resize=482%2C640" width="482" height="640" /><p class="wp-caption-text"><em>Image by mohamed_hassan on <a class="markup--anchor markup--figure-anchor" href="https://pixabay.com/" target="_blank" rel="noopener noreferrer">Pixabay</a></em></p></div></p>
</div>
</div><figcaption class="imageCaption"></figcaption></figure>
<p id="0e85" class="graf graf--p graf-after--figure">Neurons by themselves are kind of useless. But when you have lots of them, they work together to create some serious magic. That’s the idea behind a deep learning algorithm! You get input from observation and you put your input into one layer. That layer creates an output which in turn becomes the input for the next layer, and so on. This happens over and over until your final output signal!</p>
<p id="6ace" class="graf graf--p graf-after--p">So the neuron (or <strong class="markup--strong markup--p-strong">node</strong>) gets a signal or signals (<strong class="markup--strong markup--p-strong">input values</strong>), which pass through the neuron. That neuron delivers the <strong class="markup--strong markup--p-strong">output signal</strong>. Think of the input layer as your senses: the things you, for example, see, smell, and feel. These are independent variables for one single observation. This information is broken down into numbers and the bits of binary data that a computer can use. (You will need to either standardize or normalize these variables so that they’re within the same range.)</p>
<p id="71e3" class="graf graf--p graf-after--p">What about <strong class="markup--strong markup--p-strong">synapses</strong>? Each of the synapses gets assigned weights, which are crucial to <strong class="markup--strong markup--p-strong">Artificial Neural Networks</strong> (ANNs). Weights are how ANNs learn. By adjusting the weights, the ANN decides to what extent signals get passed along. When you’re training your network, you’re deciding how the weights are adjusted.</p>
<h3 id="6f2f" class="graf graf--h4 graf-after--p">How do artificial neural networks learn?</h3>
<p id="f198" class="graf graf--p graf-after--h4">There are two different approaches to get a program to do what you want. First, there’s the specifically guided and hard-programmed approach. In this approach, you tell the program exactly what you want it to do. Then there are <strong class="markup--strong markup--p-strong">neural networks</strong>. In neural networks, you tell your network the inputs and what you want for the outputs, and let it learn on its own. By allowing the network to learn on its own, we can avoid the necessity of entering in all the rules. For a neural network, you can create the architecture and then let it go and learn. Once it’s trained up, you can give it a new input (a new image, for example) and it will be able to predict an output for it.</p>
<figure id="66af" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="progressiveMedia js-progressiveMedia graf-image is-imageLoaded">
<p><div style="width: 1610px" class="wp-caption alignnone"><img loading="lazy" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://cdn-images-1.medium.com/max/1600/0*Qqhp5tU8hTHnvPn7" width="1600" height="2048" /><p class="wp-caption-text"><em>Photo by <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/@anniespratt?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-creator noopener noreferrer">Annie Spratt</a> on <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-source noopener noreferrer">Unsplash</a></em></p></div></p>
</div>
</div><figcaption class="imageCaption"></figcaption></figure>
<p id="5dc9" class="graf graf--p graf-after--figure">There are different kinds of neural networks. They’re generally classified into <strong class="markup--strong markup--p-strong">feedforward</strong> and <strong class="markup--strong markup--p-strong">feedback</strong> networks.</p>
<p id="2257" class="graf graf--p graf-after--p">A <strong class="markup--strong markup--p-strong">feedforward</strong> network is a network that contains inputs, outputs, and hidden layers. The signals can only travel in one direction (forward). Input data passes into a layer where calculations are performed. Each processing element computes based upon the weighted sum of its inputs. The new values become the new input values that feed the next layer (feed-forward). This continues through all the layers and determines the output. <span class="markup--quote markup--p-quote is-other">Feedforward networks are often used in, for example, data mining.</span></p>
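<p class="graf graf--p graf-after--p">Here’s a minimal sketch of one forward pass in plain Python. The network shape (2 inputs, 2 hidden neurons, 1 output), the weights, and the choice of a sigmoid activation are all made up for illustration.</p>

```python
import math


def sigmoid(z):
    return 1 / (1 + math.exp(-z))


def layer_forward(inputs, weights):
    """Each neuron takes the weighted sum of its inputs, then applies the activation."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)))
        for neuron_weights in weights
    ]


# Hypothetical weights: one inner list per neuron, one weight per incoming signal
hidden_weights = [[0.5, -0.4], [0.3, 0.8]]
output_weights = [[1.2, -0.7]]

x = [1.0, 2.0]
hidden = layer_forward(x, hidden_weights)       # output of the hidden layer...
output = layer_forward(hidden, output_weights)  # ...becomes input to the next layer
print(output)
```

<p class="graf graf--p graf-after--p">The signal only ever moves forward: <code>x</code> feeds the hidden layer, and the hidden layer feeds the output layer.</p>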
<p id="30b9" class="graf graf--p graf-after--p">A <strong class="markup--strong markup--p-strong">feedback network</strong> (for example, a recurrent neural network) has feedback paths. This means that it can have signals traveling in both directions using loops. All possible connections between neurons are allowed. Since loops are present in this type of network, it becomes a non-linear dynamic system which changes continuously until it reaches a state of equilibrium. <span class="markup--quote markup--p-quote is-other">Feedback networks are often used in optimization problems where the network looks for the best arrangement of interconnected factors.</span></p>
<p id="cabc" class="graf graf--p graf-after--p"><span class="markup--quote markup--p-quote is-other">The majority of modern deep learning architectures are based on artificial neural networks (ANNs).</span> They use many layers of nonlinear processing units for feature extraction and transformation. Each successive layer uses the output of the previous layer for its input. What they learn forms a hierarchy of concepts. In this hierarchy, each level learns to transform its input data into a more and more abstract and composite representation.</p>
<figure id="c782" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"></div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded">
<p><div style="width: 650px" class="wp-caption alignnone"><img data-recalc-dims="1" loading="lazy" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/6cbda-fbeef-1d9v2-vusuyrn8vm6i4cdmw.png?resize=640%2C360" width="640" height="360" /><p class="wp-caption-text"><em>Image by ahmedgad on <a class="markup--anchor markup--figure-anchor" href="http://pixabay.com/" target="_blank" rel="noopener noreferrer">Pixabay</a></em></p></div></p>
</div>
</div><figcaption class="imageCaption"></figcaption></figure>
<p id="e5a5" class="graf graf--p graf-after--figure">That means that for an image, for example, the input might be a matrix of pixels. The first layer might abstract the pixels and encode edges. The next layer might compose an arrangement of edges. The next layer might encode a nose and eyes. The next layer might recognize that the image contains a face, and so on.</p>
<p id="0daf" class="graf graf--p graf-after--p">What happens inside the <strong class="markup--strong markup--p-strong">neuron</strong>? The input node takes in information in a numerical form. The information is presented as an activation value where each node is given a number. The higher the number, the greater the activation.</p>
<p id="19f0" class="graf graf--p graf-after--p">Based on the connection strength (weights) and transfer function, the activation value passes to the next node. Each of the nodes sums the activation values that it receives (it calculates the <strong class="markup--strong markup--p-strong">weighted sum</strong>) and modifies that sum based on its transfer function, which is also known as its activation function. From that, the neuron understands if it needs to pass along a signal or not. The activation runs through the network until it reaches the output nodes. The output nodes then give us the information in a way that we can understand. Your network will use a cost function to compare the output and the actual expected output. The model performance is evaluated by the cost function. It’s expressed as the difference between the actual value and the predicted value. There are many different cost functions you can use, but they all measure how much error there is in your network. You’re working to minimize the cost function (also called the loss function). In essence, the lower the cost, the closer the output is to your desired output. The information goes back, and the neural network begins to learn with the goal of minimizing the cost function by tweaking the weights. This process is called <strong class="markup--strong markup--p-strong">backpropagation</strong>.</p>
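<p class="graf graf--p graf-after--p">As one concrete example of a cost function, here’s a small sketch of mean squared error in plain Python. It’s just one of many possible cost functions, and the actual and predicted values below are made up.</p>

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


actual = [1.0, 0.0, 1.0]
close_predictions = [0.9, 0.1, 0.8]  # hypothetical output of a well-trained network
poor_predictions = [0.2, 0.9, 0.4]   # hypothetical output of a poorly trained network

print(mean_squared_error(actual, close_predictions))  # small cost
print(mean_squared_error(actual, poor_predictions))   # much larger cost
```

<p class="graf graf--p graf-after--p">The closer the predictions get to the actual values, the lower the cost, which is exactly what training tries to achieve.</p>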
<p id="bd65" class="graf graf--p graf-after--p"><em class="markup--em markup--p-em">Interested in learning more about cost functions? Check out</em><a class="markup--anchor markup--p-anchor" href="https://stats.stackexchange.com/questions/154879/a-list-of-cost-functions-used-in-neural-networks-alongside-applications" target="_blank" rel="noopener noreferrer"><em class="markup--em markup--p-em"> A List of Cost Functions Used in Neural Networks, Alongside Applications</em></a><em class="markup--em markup--p-em"> on Stack Exchange</em></p>
<p id="cc52" class="graf graf--p graf-after--p">In forward propagation, information is entered into the input layer and propagates forward through the network to get our output values. We compare the values to our expected results. Next, we calculate the errors and propagate the info backward. This allows us to train the network and update the weights. Because of the way the algorithm is structured, backpropagation allows us to adjust all of the weights simultaneously. It also allows you to see which part of the error each of your weights in the neural network is responsible for.</p>
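<p class="graf graf--p graf-after--p">To make that loop a little more concrete, here’s a rough sketch of it for a single sigmoid neuron in plain Python. Everything here is an assumption chosen for illustration: the inputs, the target, the starting weights, the learning rate, the squared-error cost, and the use of plain gradient descent for the weight update.</p>

```python
import math


def sigmoid(z):
    return 1 / (1 + math.exp(-z))


# One neuron, two inputs, one training example; all values are made up
inputs = [1.0, 2.0]
target = 1.0
weights = [0.1, -0.2]
learning_rate = 0.5


def predict(current_weights):
    # Forward propagation: weighted sum, then activation
    return sigmoid(sum(w * x for w, x in zip(current_weights, inputs)))


def cost(current_weights):
    # Squared difference between the predicted and the expected output
    return (predict(current_weights) - target) ** 2


initial_cost = cost(weights)
for _ in range(100):
    out = predict(weights)
    # Chain rule: d(cost)/d(w_i) = 2 * (out - target) * out * (1 - out) * x_i
    shared = 2 * (out - target) * out * (1 - out)
    gradients = [shared * x for x in inputs]
    # Every weight is adjusted in the same step, each by its own share of the error
    weights = [w - learning_rate * g for w, g in zip(weights, gradients)]

final_cost = cost(weights)
print(initial_cost, final_cost)
```

<p class="graf graf--p graf-after--p">Each entry in <code>gradients</code> is the part of the error that one weight is responsible for. A real network repeats the same idea across many neurons and layers.</p>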
<p id="b154" class="graf graf--p graf-after--p"><em class="markup--em markup--p-em">Hungry for more? You might want to read </em><a class="markup--anchor markup--p-anchor" href="http://yann.lecun.com/exdb/publis/pdf/lecun-98b.pdf" target="_blank" rel="noopener noreferrer"><em class="markup--em markup--p-em">Efficient BackProp</em></a><em class="markup--em markup--p-em"> by Yann LeCun, et al., as well as </em><a class="markup--anchor markup--p-anchor" href="http://neuralnetworksanddeeplearning.com/" target="_blank" rel="noopener noreferrer"><em class="markup--em markup--p-em">Neural Networks and Deep Learning</em></a><em class="markup--em markup--p-em"> by Michael Nielsen.</em></p>
<p id="a29f" class="graf graf--p graf-after--p">When you’ve adjusted the weights to the optimal level, you’re ready to proceed to the testing phase!</p>
<h3 id="0fa7" class="graf graf--h4 graf-after--p">What is a weighted sum?</h3>
<p id="1bf2" class="graf graf--p graf-after--h4">Inputs to a neuron can either be features from a training set or outputs from the neurons of a previous layer. Each connection between two neurons has a unique synapse with a unique weight attached. If you want to get from one neuron to the next, you have to travel along the synapse and pay the “toll” (weight). The neuron then applies an activation function to the sum of the weighted inputs from each incoming synapse. It passes the result on to all the neurons in the next layer. When we talk about updating weights in a network, we’re talking about adjusting the weights on these synapses.</p>
<p id="1126" class="graf graf--p graf-after--p">A neuron’s input is the sum of weighted outputs from all the neurons in the previous layer. Each input is multiplied by the weight associated with the synapse connecting the input to the current neuron. If there are 3 inputs or neurons in the previous layer, each neuron in the current layer will have 3 distinct weights: one for each synapse.</p>
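<p class="graf graf--p graf-after--p">Here’s that 3-input case as a tiny sketch in Python. The input values and weights are made up.</p>

```python
# Outputs of the 3 neurons in the previous layer (made-up values)
inputs = [0.5, 0.3, 0.2]
# One weight per synapse leading into the current neuron (made-up values)
weights = [0.4, -0.6, 0.9]

# Multiply each input by the weight on its synapse, then add everything up
weighted_sum = sum(x * w for x, w in zip(inputs, weights))
print(weighted_sum)  # approximately 0.2
```

<p class="graf graf--p graf-after--p">The neuron would then hand this weighted sum to its activation function.</p>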
<h3 id="162a" class="graf graf--p graf-after--p"><strong class="markup--strong markup--p-strong">What&#8217;s an activation function?</strong></h3>
<p id="4dcc" class="graf graf--p graf-after--p"><span class="markup--quote markup--p-quote is-other">In a nutshell, the activation function of a node defines the output of that node.</span></p>
<p id="6985" class="graf graf--p graf-after--p">The activation function (or transfer function) translates the input signals to output signals. It maps the output values onto a range like 0 to 1 or -1 to 1. It’s an abstraction that represents the rate of action potential firing in the cell. It’s a number that represents the likelihood that the cell will fire. At its simplest, the function is binary: <strong class="markup--strong markup--p-strong">yes</strong> (the neuron fires) or <strong class="markup--strong markup--p-strong">no</strong> (the neuron doesn’t fire). The output can be either 0 or 1 (on/off or yes/no), or it can be anywhere in a range. If you were using a function that maps a range between 0 and 1 to determine the likelihood that an image is a cat, for example, an output of 0.9 would show a 90% probability that your image is, in fact, a cat.</p>
<figure id="f3af" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"></div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded">
<p><div style="width: 650px" class="wp-caption alignnone"><img data-recalc-dims="1" loading="lazy" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/4c099-774d7-15n8hphog6a9ieb3bhuibeq.jpeg?resize=640%2C359" width="640" height="359" /><p class="wp-caption-text"><em>Photo by minanafotos on <a class="markup--anchor markup--figure-anchor" href="https://pixabay.com/" target="_blank" rel="noopener noreferrer">Pixabay</a></em></p></div></p>
</div>
</div><figcaption class="imageCaption"></figcaption></figure>
<p id="0a92" class="graf graf--p graf-after--figure">What options do we have? There are many activation functions, but these are four very common ones:</p>
<ul class="postList">
<li id="39b7" class="graf graf--li graf-after--p"><strong class="markup--strong markup--li-strong">Threshold function</strong>: This is a step function. If the summed value of the input is below a certain threshold (zero, in this example), the function passes on 0. If it’s equal to or more than zero, then it would pass on 1. It’s a very rigid, straightforward, yes or no function.</li>
</ul>
<figure id="804f" class="graf graf--figure graf-after--li">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"></div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/d9ba0-c4f48-1dc237qrcxqtca5z5uetvrw.png?w=1080" /></div>
</div><figcaption class="imageCaption"><em>Example threshold function</em></figcaption></figure>
<ul class="postList">
<li id="4d5b" class="graf graf--li graf-after--figure"><strong class="markup--strong markup--li-strong">Sigmoid function</strong>: This function is used in logistic regression. Unlike the threshold function, it’s a smooth, gradual progression from 0 to 1. It’s very useful in the output layer, especially when you want to predict probabilities. (Logistic regression is one of the most well-known algorithms in statistics and machine learning.)</li>
</ul>
<figure id="cebf" class="graf graf--figure graf-after--li">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"></div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/0c1c6-2d94b-1vlljgjp2n97e1t2bcci4hg.png?w=1080" /></div>
</div><figcaption class="imageCaption"><em>Example sigmoid function</em></figcaption></figure>
<ul class="postList">
<li id="f347" class="graf graf--li graf-after--figure"><strong class="markup--strong markup--li-strong">Hyperbolic Tangent Function</strong>: This function is very similar to the sigmoid function. Unlike the sigmoid function, which goes from 0 to 1, its value goes below zero, ranging from -1 to 1. Although this isn’t what happens in biology, this function often gives better results when it comes to training neural networks. Neural networks sometimes get “stuck” during training with the sigmoid function. This happens when there’s a lot of strongly negative input that keeps the output near zero, which messes with the learning process.</li>
</ul>
<figure id="5a07" class="graf graf--figure graf-after--li">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"></div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/9a678-bc209-1xrqab63j8sz8emqscqnx4q.png?w=1080" /></div>
</div><figcaption class="imageCaption"><em>Example hyperbolic tangent function (tanh)</em></figcaption></figure>
<ul class="postList">
<li id="c508" class="graf graf--li graf-after--figure"><strong class="markup--strong markup--li-strong">Rectifier function</strong>: This might be the most popular activation function in the universe of neural networks. It’s efficient to compute and is often described as biologically plausible. It outputs 0 for any input below zero and passes positive input through unchanged, so even though it has a kink at 0, it rises in a straight line after the kink. This means, for example, that your output would be either “no” or a percentage of “yes.” This function doesn’t require normalization or other complicated calculations.</li>
</ul>
<figure id="fbf8" class="graf graf--figure graf-after--li">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"></div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/a5329-3965e-1clkglxsbu4p0rdf5iwcj_g.png?w=1080" /></div>
</div><figcaption class="imageCaption"><em>Example rectifier function</em></figcaption></figure>
<p id="c1ed" class="graf graf--p graf-after--figure"><em class="markup--em markup--p-em">Want to dive deeper? Check out </em><a class="markup--anchor markup--p-anchor" href="http://proceedings.mlr.press/v15/glorot11a/glorot11a.pdf" target="_blank" rel="noopener noreferrer"><em class="markup--em markup--p-em">Deep Sparse Rectifier Neural Networks</em></a><em class="markup--em markup--p-em"> by Xavier Glorot, et al.</em></p>
<p id="da25" class="graf graf--p graf-after--p">So let’s say, for example, your desired value is binary. You’re looking for a “yes” or a “no.” Which activation function do you want to use? From the above examples, you could use the threshold function, or you could go with the sigmoid activation function. The sigmoid function would be able to give you the probability of a yes.</p>
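<p class="graf graf--p">To make the comparison concrete, here is a minimal Python sketch of the four activation functions discussed above. The function names are made up for illustration, and the threshold convention (output 1 once the input reaches 0) is an assumption; other write-ups draw the line slightly differently.</p>

```python
import math

def threshold(x):
    """Step function: 1 once the input reaches 0, otherwise 0 (assumed convention)."""
    return 1.0 if x >= 0 else 0.0

def sigmoid(x):
    """Smooth S-curve that squashes any input into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    """Same S-shape as the sigmoid, but the output runs from -1 to 1."""
    return math.tanh(x)

def rectifier(x):
    """Rectifier (ReLU): 0 for negative input, the input itself otherwise."""
    return max(0.0, x)

# Compare the four functions on a negative, a zero, and a positive input.
for x in (-2.0, 0.0, 2.0):
    print(x, threshold(x), round(sigmoid(x), 3), round(tanh(x), 3), rectifier(x))
```

<p class="graf graf--p">For a yes/no target, the threshold function gives a hard 0 or 1, while the sigmoid gives a value between 0 and 1 that you can read as the probability of a yes.</p>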
<figure id="f693" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"></div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded">
<p><div style="width: 1610px" class="wp-caption alignnone"><img loading="lazy" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://cdn-images-1.medium.com/max/1600/0*qrf7PK-wOnWDlfub" width="1600" height="1079" /><p class="wp-caption-text"><em>Photo by <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/@rawpixel?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-creator noopener noreferrer">rawpixel</a> on <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-source noopener noreferrer">Unsplash</a></em></p></div></p>
</div>
</div><figcaption class="imageCaption"></figcaption></figure>
<p id="65af" class="graf graf--p graf-after--figure"><strong class="markup--strong markup--p-strong">So, how are the weights adjusted, exactly?</strong></p>
<p id="ae4c" class="graf graf--p graf-after--p">You could use a brute force approach to adjust the weights and test thousands of different combinations. Even with the simplest neural network that has only five input values and a single hidden layer, you’ll wind up with 10⁷⁵ possible combinations. Running this on the world’s fastest supercomputer would take longer than the universe has existed so far.</p>
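<p class="graf graf--p">The post doesn’t spell out where 10⁷⁵ comes from. One combination that reproduces it, offered purely as an illustrative assumption, is 25 weights with 1,000 candidate values tried for each. The quick check below shows how fast that multiplies out.</p>

```python
# Assumption for illustration only: 25 weights, each tested at 1,000 candidate values.
n_weights = 25
values_per_weight = 1000

# Every weight can take any of its values independently, so the counts multiply.
combinations = values_per_weight ** n_weights

print(combinations == 10 ** 75)  # True
print(len(str(combinations)))    # 76 digits: a 1 followed by 75 zeros
```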
<figure id="167c" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"></div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded">
<p><div style="width: 650px" class="wp-caption alignnone"><img data-recalc-dims="1" loading="lazy" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/620a3-64ada-1ix5nuvicorqiotrzh9x1qq.jpeg?resize=640%2C426" width="640" height="426" /><p class="wp-caption-text"><em>Photo by skorchanov on <a class="markup--anchor markup--figure-anchor" href="https://pixabay.com/" target="_blank" rel="noopener noreferrer">Pixabay</a></em></p></div></p>
</div>
</div><figcaption class="imageCaption"></figcaption></figure>
<p id="e109" class="graf graf--p graf-after--figure">However, if you go with <strong class="markup--strong markup--p-strong">gradient descent</strong>, you can look at the slope of the cost function with respect to the weights and find out if it’s positive or negative. That tells you which direction is downhill, so you can keep stepping toward the best weights on your quest to reach the global minimum.</p>
<p id="81cf" class="graf graf--p graf-after--p"><strong class="markup--strong markup--p-strong">Gradient descent</strong> is an algorithm for finding the minimum of a function. The analogy you’ll see over and over is that of someone stuck on top of a mountain and trying to get down (find the minimum). There’s heavy fog making it impossible to see the path, so she uses gradient descent to get to the bottom. She looks at the steepness of the hill where she is and heads in the direction of the steepest descent. Assume the steepness isn’t immediately obvious. Luckily, she has a tool that can measure it. Unfortunately, this tool takes forever, so she wants to use it as infrequently as she can to get down the mountain before dark. The real difficulty is choosing how often to use the tool so she doesn’t go off track. In this analogy, the person is the algorithm. The steepness of the hill is the slope of the error surface at that point. The direction she goes is opposite to the gradient of the error surface at that point, because the gradient points uphill. The tool she’s using is differentiation (the slope of the error surface can be calculated by taking the derivative of the squared error function at that point). The distance she travels before taking another measurement is set by the learning rate of the algorithm. It’s not a perfect analogy, but it gives you a good sense of what gradient descent is all about: the machine uses the gradient to work out which direction the weights should move to reduce the error.</p>
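<p class="graf graf--p">Here is the mountain analogy as a minimal Python sketch. The bowl-shaped cost function, the starting point, and the learning rate are all assumptions chosen to keep the example tiny; a real network has one slope per weight rather than a single number.</p>

```python
def cost(w):
    """Toy error surface: a bowl whose lowest point is at w = 3."""
    return (w - 3.0) ** 2

def slope(w):
    """Derivative of the cost. Positive means the ground rises to the right."""
    return 2.0 * (w - 3.0)

def gradient_descent(w, learning_rate=0.1, steps=50):
    """Repeatedly measure the slope and take a step in the downhill direction."""
    for _ in range(steps):
        w = w - learning_rate * slope(w)  # step size is set by the learning rate
    return w

w_final = gradient_descent(w=-4.0)
print(round(w_final, 3), round(cost(w_final), 6))
```

<p class="graf graf--p">A larger learning rate means fewer measurements but a bigger risk of overshooting the bottom; a smaller one is safer but slower.</p>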
<figure id="f2c9" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"></div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/0c358-31ae2-1kmmjfbp5vrkkom1sp4urpa.png?w=1080" /></div>
</div>
</figure>
<p id="9765" class="graf graf--p graf-after--figure"><strong class="markup--strong markup--p-strong">Stochastic Gradient Descent</strong></p>
<p id="8362" class="graf graf--p graf-after--p">Gradient descent is only guaranteed to find the global minimum when the cost function is convex, but what if it isn’t?</p>
<figure id="e599" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"></div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/80060-b792c-1b7gpub8q4zvxdv4gcbsk3a.png?w=1080" /></div>
</div>
</figure>
<p id="80cf" class="graf graf--p graf-after--figure">Normal gradient descent can get stuck at a local minimum rather than the global minimum, resulting in a subpar network. In normal gradient descent, we take all our rows and plug them into the same neural network, look at the cost function, and then adjust the weights. This is called batch gradient descent. In stochastic gradient descent, we take the rows one by one, run the neural network, look at the cost function, adjust the weights, and then move to the next row. Essentially, you’re adjusting the weights for each row.</p>
<p id="4439" class="graf graf--p graf-after--p">Stochastic gradient descent has much higher fluctuations, which can help it jump out of a local minimum and keep heading toward the global minimum. It’s called “stochastic” because samples are shuffled randomly, instead of being processed as a single group or in the order they appear in the training set. It looks like it might be slower, but each update is actually faster because it doesn’t have to load all the data into memory and wait while the data is all run together. The main pro for batch gradient descent is that it’s a deterministic algorithm. This means that if you have the same starting weights, every time you run the network you will get the same results. Stochastic gradient descent picks its rows at random, so the results can differ from run to run. (You can also run mini-batch gradient descent, where you set a number of rows, run that many rows at a time, and then update your weights.)</p>
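<p class="graf graf--p">The three variants differ only in how many rows feed each weight update. The sketch below fits a single weight to a made-up dataset (y = 2 * x) so the difference is easy to see; the data, the <code class="markup--code markup--p-code">train</code> helper, and the learning rate are all assumptions for illustration.</p>

```python
import random

# Toy data for illustration: y = 2 * x, so the best weight is 2.
rows = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]

def gradient(w, batch):
    """Average slope of the squared error (w * x - y) ** 2 over a batch of rows."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)

def train(batch_size, epochs=200, learning_rate=0.01, seed=0):
    """Hypothetical helper: gradient descent with a configurable batch size."""
    rng = random.Random(seed)  # fixed seed so the shuffling is repeatable
    w = 0.0
    for _ in range(epochs):
        data = rows[:]
        if batch_size != len(rows):
            rng.shuffle(data)  # stochastic variants visit rows in random order
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            w -= learning_rate * gradient(w, batch)
    return w

w_batch = train(batch_size=len(rows))  # batch: one update per pass over all rows
w_sgd = train(batch_size=1)            # stochastic: one update per row
w_mini = train(batch_size=2)           # mini-batch: one update per two rows
print(round(w_batch, 3), round(w_sgd, 3), round(w_mini, 3))
```

<p class="graf graf--p">On clean data like this, all three land on the same weight. The differences show up on noisy data and bumpy cost functions, where the extra jitter from per-row updates starts to matter.</p>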
<p id="45f2" class="graf graf--p graf-after--p">Many improvements on the basic stochastic gradient descent algorithm have been proposed and used, including implicit updates (ISGD), momentum method, averaged stochastic gradient descent, adaptive gradient algorithm (AdaGrad), root mean square propagation (RMSProp), adaptive moment estimation (Adam), and more.</p>
<p id="4772" class="graf graf--p graf-after--p"><em class="markup--em markup--p-em">Loving this? You might want to take a look at </em><a class="markup--anchor markup--p-anchor" href="https://iamtrask.github.io/2015/07/27/python-network-part2/" target="_blank" rel="noopener noreferrer"><em class="markup--em markup--p-em">A Neural Network in 13 lines of Python-Part 2 Gradient Descent</em></a><em class="markup--em markup--p-em"> by Andrew Trask </em><strong class="markup--strong markup--p-strong"><em class="markup--em markup--p-em">and</em></strong><em class="markup--em markup--p-em"> </em><a class="markup--anchor markup--p-anchor" href="http://neuralnetworksanddeeplearning.com/" target="_blank" rel="noopener noreferrer"><em class="markup--em markup--p-em">Neural Networks and Deep Learning</em></a><em class="markup--em markup--p-em"> by Michael Nielsen</em></p>
<p id="e2f8" class="graf graf--p graf-after--p">So here’s a quick walkthrough of training an artificial neural network with stochastic gradient descent:</p>
<ul class="postList">
<li id="0344" class="graf graf--li graf-after--p">1: Randomly initialize the weights to small numbers close to 0.</li>
<li id="56bd" class="graf graf--li graf-after--li">2: Input the first observation of your dataset into the input layer, with each feature in one input node.</li>
<li id="f8a4" class="graf graf--li graf-after--li">3: <strong class="markup--strong markup--li-strong">Forward propagation</strong>: from left to right, the neurons are activated, with each neuron’s activation scaled by the weights on its connections. You propagate the activations until you get the predicted result.</li>
<li id="b4a9" class="graf graf--li graf-after--li">4: Compare the predicted result to the actual result and measure the generated error.</li>
<li id="d22e" class="graf graf--li graf-after--li">5: <strong class="markup--strong markup--li-strong">Backpropagation</strong>: from right to left, the error is backpropagated. The weights are updated according to how much they are responsible for the error. (The learning rate decides how much we update the weights.)</li>
<li id="baa2" class="graf graf--li graf-after--li">6: <strong class="markup--strong markup--li-strong">Online learning</strong> (repeat steps 2–5 and update the weights after each observation, as stochastic gradient descent does) <strong class="markup--strong markup--li-strong">OR</strong> <strong class="markup--strong markup--li-strong">batch learning</strong> (repeat steps 2–5, but update the weights only after a batch of observations).</li>
<li id="bd2b" class="graf graf--li graf-after--li">7: When the whole training set has passed through the ANN, that is one epoch. Repeat with more epochs.</li>
</ul>
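<p class="graf graf--p">The steps above can be sketched in plain Python. Everything specific here is an assumption made for illustration: the toy dataset (logical OR), the network size (two inputs, two hidden neurons, one output), the sigmoid activations, the squared-error cost, and the learning rate. It updates the weights after every observation.</p>

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Toy dataset for illustration: the logical OR of two inputs.
data = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 1.0)]

rng = random.Random(0)
n_hidden = 2
# Step 1: small random starting weights (the last entry in each list is a bias).
w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(n_hidden)]
w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

def forward(x):
    """Step 3: propagate the inputs left to right through the network."""
    hidden = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w_hidden]
    output = sigmoid(sum(w_out[j] * hidden[j] for j in range(n_hidden)) + w_out[-1])
    return hidden, output

learning_rate = 0.5
for epoch in range(2000):              # Step 7: each full pass is one epoch
    rng.shuffle(data)                  # visit the rows in random order
    for x, target in data:             # Step 2: feed in one observation
        hidden, output = forward(x)
        error = output - target        # Step 4: compare prediction and target
        # Step 5: backpropagation for the squared error 0.5 * error ** 2
        delta_out = error * output * (1.0 - output)
        delta_hidden = [delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                        for j in range(n_hidden)]
        # Step 6: update the weights after every observation
        for j in range(n_hidden):
            w_out[j] -= learning_rate * delta_out * hidden[j]
            w_hidden[j][0] -= learning_rate * delta_hidden[j] * x[0]
            w_hidden[j][1] -= learning_rate * delta_hidden[j] * x[1]
            w_hidden[j][2] -= learning_rate * delta_hidden[j]
        w_out[-1] -= learning_rate * delta_out

# Round each output to the nearest whole number to get a yes/no answer.
predictions = {x: round(forward(x)[1]) for x, _ in data}
print(predictions)
```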
<p id="ea83" class="graf graf--p graf-after--li">There you have it! Those are the basic ideas behind what’s happening in an artificial neural network.</p>
<figure id="9969" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"></div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded">
<p><div style="width: 1610px" class="wp-caption alignnone"><img loading="lazy" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://cdn-images-1.medium.com/max/1600/0*ucQnYNwCGU0B20Gb" width="1600" height="2149" /><p class="wp-caption-text"><em>Photo by <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/@sammathews?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-creator noopener noreferrer">Sam Mathews</a> on <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-source noopener noreferrer">Unsplash</a></em></p></div></p>
</div>
</div><figcaption class="imageCaption"></figcaption></figure>
<p id="8536" class="graf graf--p graf-after--figure"><strong class="markup--strong markup--p-strong">Still with me? </strong><a class="markup--anchor markup--p-anchor" href="https://contentsimplicity.com/wft-is-image-classification/" target="_blank" rel="noopener noreferrer"><strong class="markup--strong markup--p-strong">Come on over to part 3</strong></a><strong class="markup--strong markup--p-strong">!</strong></p>
<p id="82f2" class="graf graf--p graf-after--p graf--trailing">(If anyone out there has any specific topics they want me to cover, leave a comment in the responses below and I’ll tackle them if I can!)</p>
<span class="et_bloom_bottom_trigger"></span><p>The post <a href="https://contentsimplicity.com/simply-deep-learning-an-effortless-introduction/">Simply deep learning: an effortless introduction</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://contentsimplicity.com/simply-deep-learning-an-effortless-introduction/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">189</post-id>	</item>
		<item>
		<title>How to Set up Kaggle in Google Colab</title>
		<link>https://contentsimplicity.com/how-to-set-up-kaggle-in-google-colab/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=how-to-set-up-kaggle-in-google-colab&#038;utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=how-to-set-up-kaggle-in-google-colab</link>
					<comments>https://contentsimplicity.com/how-to-set-up-kaggle-in-google-colab/#respond</comments>
		
		<dc:creator><![CDATA[Anne B]]></dc:creator>
		<pubDate>Sun, 03 Mar 2019 22:26:55 +0000</pubDate>
				<category><![CDATA[tutorial]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Coding]]></category>
		<category><![CDATA[colab]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Kaggle]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[setup]]></category>
		<guid isPermaLink="false">https://learnbasictech.wordpress.com/?p=30</guid>

					<description><![CDATA[<p>You know where all those datasets are and you know where you want them to go, but how do you easily move your datasets from Kaggle into Google Colab without a lot of complicated madness?</p>
<p>Let me show you!</p>
<p>The post <a href="https://contentsimplicity.com/how-to-set-up-kaggle-in-google-colab/">How to Set up Kaggle in Google Colab</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<figure class="wp-block-image is-resized"><img data-recalc-dims="1" loading="lazy" decoding="async" data-attachment-id="170" data-permalink="https://contentsimplicity.com/getting-started-with-google-colab/pixabay_alexas_fotos-2/" data-orig-file="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/83d43-pixabay_alexas_fotos-1.jpg?fit=640%2C425&amp;ssl=1" data-orig-size="640,425" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="pixabay_Alexas_Fotos" data-image-description="" data-image-caption="" data-medium-file="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/83d43-pixabay_alexas_fotos-1.jpg?fit=300%2C199&amp;ssl=1" data-large-file="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/83d43-pixabay_alexas_fotos-1.jpg?fit=640%2C425&amp;ssl=1" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/83d43-pixabay_alexas_fotos-1.jpg?resize=754%2C501&#038;ssl=1" alt="" class="wp-image-170" width="754" height="501" /></figure>


<figure id="bdae" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<p class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><span style="font-size:inherit;">You know where all those datasets are and you know where you want them to go, but how do you easily move your datasets from Kaggle into Google Colab without a lot of complicated madness?</span></p>
</div>
</figure>
<p id="697f" class="graf graf--p graf-after--p">Let me show you!</p>
<p id="70c5" class="graf graf--p graf-after--p">Discovering the joy that is <a class="markup--anchor markup--p-anchor" href="https://colab.research.google.com/" target="_blank" rel="noopener noreferrer">Google Colab</a> was definitely one of the smartest things I’ve done since getting started with deep learning, machine learning, and AI. Google Colab provides free GPU (for real!) to pretty much anyone who wants it. If you’re just getting started, you need to get on Colab! I wrote <a class="markup--anchor markup--p-anchor" href="https://towardsdatascience.com/getting-started-with-google-colab-f2fff97f594c" target="_blank" rel="noopener noreferrer">another article</a> that covers getting set up in Colab for the first time, but getting Kaggle up and running in Colab really deserves its own article.</p>
<figure id="bb9a" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded">
<div style="width: 1610px" class="wp-caption alignnone"><img loading="lazy" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://cdn-images-1.medium.com/max/1600/0*pZoICdLGDKbKRtYU" width="1600" height="2396" /><p class="wp-caption-text">Photo by <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/@messisorder?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-creator noopener noreferrer">Oscar Söderlund</a> on <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-source noopener noreferrer">Unsplash</a></p></div>
</div>
</div>
<figcaption class="imageCaption"></figcaption>
</figure>
<p id="a22b" class="graf graf--p graf-after--figure">Although Colab is extremely user-friendly, there are a few details that you might want help with while getting yourself set up.</p>
<p id="2d4d" class="graf graf--p graf-after--p">Kaggle, it turns out, is one of those details.</p>
<p id="c7f8" class="graf graf--p graf-after--p">Kaggle needs a little finesse. A little love. However, if you’re after those sweet, sweet datasets, you want to get this working! It’s actually really simple; there are just a few easy steps you need to take. If you just want to <a class="markup--anchor markup--p-anchor" href="https://github.com/bonn0062/colab_kaggle_api" target="_blank" rel="noopener noreferrer">view the code on GitHub</a> and move on with your day (things can get a little…verbose…around here), you are welcome to do so!</p>
<p id="8b5d" class="graf graf--p graf-after--p">Here’s the simplest way I’ve found to access the Kaggle data for the first time:</p>
<h4 id="8e0d" class="graf graf--h4 graf-after--p"><strong class="markup--strong markup--h4-strong">Getting Started</strong></h4>
<p id="f2c2" class="graf graf--p graf-after--h4">(One quick note: in order to be able to access the Kaggle data, you’ll need to be signed up with Kaggle (free!) and agree to the terms and conditions of the competition that you want to participate in.)</p>
<p id="7911" class="graf graf--p graf-after--p">First, grab your token from Kaggle.</p>
<p id="17b2" class="graf graf--p graf-after--p">Go to your account page (the drop-down menu in the top right corner of the screen will take you there).</p>
<figure id="428b" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/89817-fd9fa-1f94g6dxm-8crxx5fftcqyg.png?w=1080" /></div>
</div>
</figure>
<p id="1674" class="graf graf--p graf-after--figure">Then scroll down to API and hit “Create New API Token.”</p>
<figure id="3e13" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/4e836-dfd79-1rhsgkthdgxwuoskaacslwa.png?w=1080" /></div>
</div>
</figure>
<p id="b5dc" class="graf graf--p graf-after--figure">That’s going to download a file called <strong class="markup--strong markup--p-strong">kaggle.json</strong>. Make sure you know where this file is! Maybe put it somewhere you can find it…</p>
<p id="5691" class="graf graf--p graf-after--p">Just a suggestion.</p>
<p id="b219" class="graf graf--p graf-after--p">Open the file and you’ll see something that looks a lot like this:</p>
<pre id="a4fe" class="graf graf--pre graf-after--p">{"username":"YOUR-USER-NAME","key":"SOME-VERY-LONG-STRING"}</pre>
<p id="2ada" class="graf graf--p graf-after--pre">Have that thing handy for a future copy-and-paste!</p>
<p id="3048" class="graf graf--p graf-after--p">Next, go to Colab and start a new notebook. I’m a big fan of getting up and running on GPU right away, and to do that, go to the “runtime” drop-down menu, select “change runtime type” and then select GPU in the “Hardware accelerator” drop-down menu. Then hit SAVE.</p>
<figure id="7cb6" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/3259a-34841-17du2-qmemro5nvxnepnukq.png?w=1080" /></div>
</div>
</figure>
<p id="59c5" class="graf graf--p graf-after--figure">Next, you’ll want to install Kaggle. It’s almost exactly like installing it from a terminal, but a notebook cell wants an exclamation point at the beginning of a shell command. Just run:</p>
<pre id="c44e" class="graf graf--pre graf-after--p">!pip install kaggle</pre>
<p id="c06f" class="graf graf--p graf-after--pre">You can use <code class="markup--code markup--p-code">!ls -a</code> to check if you already have a folder called .kaggle (it’s hidden, so a plain <code class="markup--code markup--p-code">!ls</code> won’t show it), or just run</p>
<pre id="4ac5" class="graf graf--pre graf-after--p">!mkdir .kaggle</pre>
<p id="0bb6" class="graf graf--p graf-after--pre">to create one.</p>
<p id="b8d5" class="graf graf--p graf-after--p">Next, you’ll want to run the cell below, but please pay attention to a couple of things:</p>
<ul class="postList">
<li id="e325" class="graf graf--li graf-after--p">there’s no exclamation point on this one</li>
<li id="0d92" class="graf graf--li graf-after--li">you definitely want to change the username and key to the ones you did that copy-and-paste on from your downloaded Kaggle file!</li>
</ul>
<pre id="2782" class="graf graf--pre graf-after--li">import json</pre>
<pre id="4419" class="graf graf--pre graf-after--pre">token = {"username":"YOUR-USER-NAME","key":"SOME-VERY-LONG-STRING"}</pre>
<pre id="d982" class="graf graf--pre graf-after--pre">with open('/content/.kaggle/kaggle.json', 'w') as file:
    json.dump(token, file)</pre>
<p id="8f2d" class="graf graf--p graf-after--pre">I did a copy-and-paste when I ran this code and actually had a little trouble: I had to delete and re-type the quotation marks in the code above to get that cell to run properly. The usual culprit is that web pages turn straight quotes into curly “smart” quotes, which Python doesn’t accept. If you’re popping an error for no discernible reason, give that a try!</p>
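<p class="graf graf--p">If re-typing quotes by hand sounds tedious, a small helper can do it for you. This is a rough sketch, and <code class="markup--code markup--p-code">straighten_quotes</code> is a made-up name, not part of Kaggle or Colab. It swaps curly quotes for straight ones so the pasted text parses as JSON.</p>

```python
import json

# Text as it might look after a copy-and-paste from a web page (curly quotes).
pasted = '{“username”:”YOUR-USER-NAME”,”key”:”SOME-VERY-LONG-STRING”}'

def straighten_quotes(text):
    """Swap curly double and single quotes for their plain ASCII versions."""
    replacements = {"“": '"', "”": '"', "‘": "'", "’": "'"}
    for curly, straight in replacements.items():
        text = text.replace(curly, straight)
    return text

# json.loads would raise an error on the curly version; the cleaned text parses fine.
token = json.loads(straighten_quotes(pasted))
print(sorted(token))  # ['key', 'username']
```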
<p id="b9ba" class="graf graf--p graf-after--p">Next, run</p>
<pre id="71b5" class="graf graf--pre graf-after--p">!mkdir -p ~/.kaggle
!cp /content/.kaggle/kaggle.json ~/.kaggle/kaggle.json</pre>
<p id="89cd" class="graf graf--p graf-after--pre">and then</p>
<pre id="017b" class="graf graf--pre graf-after--p">!kaggle config set -n path -v /content</pre>
<p id="aa37" class="graf graf--p graf-after--pre">You’ll get a warning that looks like this:</p>
<figure id="31f3" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/8a928-87c7f-1pdrwskutg7sopcl7kmdczg.png?w=1080" /></div>
</div>
</figure>
<p id="5d21" class="graf graf--p graf-after--figure">You can easily fix that by running:</p>
<pre id="2d1e" class="graf graf--pre graf-after--p">!chmod 600 /root/.kaggle/kaggle.json</pre>
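<p class="graf graf--p">If you’d rather handle the token in one Python cell, the write, copy, and permission steps can be folded together. This is only a sketch: <code class="markup--code markup--p-code">save_kaggle_token</code> and its <code class="markup--code markup--p-code">home</code> argument are hypothetical names, and the demo writes to a temporary folder rather than your real home directory.</p>

```python
import json
import os
import stat
import tempfile
from pathlib import Path

def save_kaggle_token(token, home=None):
    """Write the token to (home)/.kaggle/kaggle.json, readable only by its owner."""
    base = Path(home) if home else Path.home()
    config_dir = base / ".kaggle"
    config_dir.mkdir(parents=True, exist_ok=True)  # create the folder if missing
    token_path = config_dir / "kaggle.json"
    with open(token_path, "w") as file:
        json.dump(token, file)
    os.chmod(token_path, 0o600)  # same effect as: chmod 600 kaggle.json
    return token_path

# Demo in a throwaway folder; drop the home argument to use your real home directory.
with tempfile.TemporaryDirectory() as tmp:
    path = save_kaggle_token(
        {"username": "YOUR-USER-NAME", "key": "SOME-VERY-LONG-STRING"}, home=tmp
    )
    saved = json.loads(path.read_text())
    mode = stat.S_IMODE(path.stat().st_mode)
    print(path.name, oct(mode))
```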
<p id="de92" class="graf graf--p graf-after--pre">After that, you should be able to run</p>
<pre id="066c" class="graf graf--pre graf-after--p">!kaggle datasets list</pre>
<p id="3a92" class="graf graf--p graf-after--pre">to access a list of Kaggle datasets.</p>
<figure id="7c5e" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/378fd-db14a-1krsozzyxzanblhwgtb8k7w.png?w=1080" /></div>
</div>
</figure>
<p id="8a60" class="graf graf--p graf-after--figure">If you’re looking for a specific dataset, you can run something like</p>
<pre id="ce99" class="graf graf--pre graf-after--p">!kaggle datasets list -s sentiment</pre>
<p id="11f1" class="graf graf--p graf-after--pre">in order to list, for example, datasets that include “sentiment” in their titles.</p>
<figure id="a13a" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/e4607-bb714-1qz1raihuk-hjqrv_jt71wg.png?w=1080" /></div>
</div>
</figure>
<p id="84fd" class="graf graf--p graf-after--figure">Now it’s time to start having real fun!</p>
<h4 id="a3ec" class="graf graf--h4 graf-after--p"><strong class="markup--strong markup--h4-strong">Downloading the Data</strong></h4>
<p id="7e03" class="graf graf--p graf-after--h4">Go to Kaggle, find the dataset you want, and on that page, click the API button (it will copy the code automatically).</p>
<figure id="db9b" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded"><img data-recalc-dims="1" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://i0.wp.com/contentsimplicity.com/wp-content/uploads/2019/03/3355d-37e17-1r1iyadbq5jzrcsareckffq.png?w=1080" /></div>
</div>
</figure>
<p id="86a7" class="graf graf--p graf-after--figure">You’ll paste that code into your next cell, but make sure you add that exclamation point to the beginning of the cell and add <code class="markup--code markup--p-code">-p /content</code> to clarify your path.</p>
<pre id="7398" class="graf graf--pre graf-after--p">!kaggle datasets download -d kazanova/sentiment140 -p /content</pre>
<p id="cc88" class="graf graf--p graf-after--pre">To unzip your files, run</p>
<pre id="2c70" class="graf graf--pre graf-after--p">!unzip *.zip</pre>
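<p class="graf graf--p">If you’d rather stay in Python, the standard library can unzip for you. A minimal sketch follows; <code class="markup--code markup--p-code">unzip_all</code> is a made-up helper name, and the demo builds a tiny archive in a temporary folder just to show what it returns.</p>

```python
import tempfile
import zipfile
from pathlib import Path

def unzip_all(folder="/content"):
    """Extract every .zip archive in the folder and return the extracted file names."""
    extracted = []
    for archive in sorted(Path(folder).glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(folder)
            extracted.extend(zf.namelist())
    return extracted

# Demo: create a small archive in a throwaway folder, then extract it.
with tempfile.TemporaryDirectory() as tmp:
    with zipfile.ZipFile(Path(tmp) / "demo.zip", "w") as zf:
        zf.writestr("hello.csv", "a,b\n1,2\n")
    names = unzip_all(tmp)
    print(names)  # ['hello.csv']
```

<p class="graf graf--p">In Colab, calling <code class="markup--code markup--p-code">unzip_all()</code> with no arguments would look in /content, which is where the download step above puts the archive.</p>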
<p id="7a2c" class="graf graf--p graf-after--pre">Welcome to Data Town!!! Want to take a look? Try running:</p>
<pre id="c590" class="graf graf--pre graf-after--p">import pandas as pd</pre>
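<pre id="d25a" class="graf graf--pre graf-after--pre">d = pd.read_csv('training.1600000.processed.noemoticon.csv', encoding='latin-1', header=None)</pre>
<pre id="b5b0" class="graf graf--pre graf-after--pre">d.head()</pre>
<p id="71c2" class="graf graf--p graf-after--pre">(Substitute a filename in your dataset for the filename above, of course. This particular file isn’t UTF-8 and has no header row, which is what the two extra arguments are for; other datasets may not need them.)</p>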
<p id="4988" class="graf graf--p graf-after--p">Now get out there and create something amazing!</p>
<figure id="5f7e" class="graf graf--figure graf-after--p">
<div class="aspectRatioPlaceholder is-locked">
<div class="aspectRatioPlaceholder-fill"> </div>
<div class="progressiveMedia js-progressiveMedia graf-image is-canvasLoaded is-imageLoaded">
<div style="width: 1610px" class="wp-caption alignnone"><img loading="lazy" decoding="async" class="progressiveMedia-image js-progressiveMedia-image" src="https://cdn-images-1.medium.com/max/1600/0*LA95fw6BofrskJdQ" width="1600" height="2396" /><p class="wp-caption-text">Photo by <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/@fifernando?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-creator noopener noreferrer">Fidel Fernando</a> on <a class="markup--anchor markup--figure-anchor" href="https://unsplash.com/?utm_source=medium&amp;utm_medium=referral" target="_blank" rel="photo-source noopener noreferrer">Unsplash</a></p></div>
</div>
</div>
<figcaption class="imageCaption"></figcaption>
</figure>
<p id="6fd2" class="graf graf--p graf-after--figure graf--trailing">If anyone out there does something seriously awesome with their newly-gotten data, I want to hear about it! Please let everyone know what you’ve created in the responses below.</p><span class="et_bloom_bottom_trigger"></span><p>The post <a href="https://contentsimplicity.com/how-to-set-up-kaggle-in-google-colab/">How to Set up Kaggle in Google Colab</a> appeared first on <a href="https://contentsimplicity.com">Content Simplicity</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://contentsimplicity.com/how-to-set-up-kaggle-in-google-colab/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">161</post-id>	</item>
	</channel>
</rss>