Content Simplicity

How to Write and Publish Articles That Get Noticed

Anne B — Mon, 02 Nov 2020 13:21:00 +0000

Image by Ralf Kunze from Pixabay

This article first appeared in Towards Data Science

Simple techniques for creating content that’s easy to find and exciting to read

Want to learn how to write, publish, and get noticed?

I get a lot of views on Medium. About 100,000 every 30 days. As of April, I had been writing for four months and only wrote three to five articles each month.

It’s pretty exciting.

(Update: Thanks to all of you amazing human beings, I reached 100K readers in April!!!)

April-May 2019

Everyone wants to know how they can write, publish, and get their articles noticed in the endless expanse that is the internet. Whether you’re running a business, launching a new product, or putting everything you have into a blog, you want your content seen.

Most people think that there must be some kind of trick to it. There isn’t! You don’t need to be a part of a team that writes hundreds of articles a day. You don’t need to pay for views or hack any systems. There are a ton of simple and free things that you can do right now to make your content stand out and get noticed.

Just remember that you need to do what works for you! My posts might be different than yours and my goals might be different than yours. The joy for me is in sharing the cool stuff I know with as many people out in the world as possible. You might want something different.

The internet is full of “common knowledge” that people who write and publish swear by. This includes things like:

Write shorter articles. Ones that take 6–8 minutes to read are ideal.
Publish frequently.
Publish on weekdays.
Find a great featured image.
Keep your paragraphs short.

These are great tips! But as you try what “everyone” says is effective, remember to always pay attention to what works best for you as you write, publish, and move forward.

You might be surprised.

Every couple of weeks I tend to write one 15- to 19-minute piece and publish it on a Saturday. That’s pretty much it.
You do you, boo.

Publish on Medium
There’s a good chance that you’re already doing this. But if you’re out there blogging all alone and wondering if anyone will ever notice your amazing work, republish your content on Medium! Medium has somewhere in the neighborhood of a gazillion views every month. Take advantage of this when you write and publish your work! You can easily import your content from your existing blog or website and Google will not punish you for it.
Importing your content is incredibly simple. Just click on your profile picture in the top-right corner, go to stories, click on “Import a story,” paste in your URL, and you’re basically done. The directions are right here and it’s crazy easy. Your original source will automatically be referenced by a canonical URL and both Google and your SEO will be happy.

Image by Eden Ware from Pixabay

Content is key

This is critical. I know everybody always says this part, but it’s important and I’m saying it again. Write, publish, and share something that you care about and take your time with it. Put your heart and soul into it and then load it up with fun visuals.

Now spend as much time editing that thing as you did writing it.

Get Grammarly. The free version is great. It will edit as you go, saving you hours of effort and anxiety. Run your post through Hemingway App too. You want to write at an 8th-grade level or below. 6th grade seems to be the sweet spot for my articles. Hemingway will help you easily determine the reading level. This is not about dumbing your pieces down. I’m a top writer in artificial intelligence and technology and write articles at about a 6th-grade reading level.

Now spend as much time working on the title of your post as you did writing and editing your post. Seriously. The title can make or break you. You can look at headline analyzers (people seem to like CoSchedule) or just pay attention to which articles you actually click and read throughout your day.

Write out your title and then Google it. Look at the results.

If you were looking for exactly the information that you just wrote about, would you click on your title first?

Go hit that “Ready to publish?” button at the top of your screen to see what your article will look like. Would you click on that? Does it say what you want it to say? Does it accurately represent your content? (You’d be surprised at how easy it is to forget that part in the quest to be funny, clever, and/or attention-grabbing.) Did you include a power word? You don’t have to, but people do like them. Did you go too far and turn it into clickbait? Medium readers and curators generally don’t like clickbait, so it’s best if you avoid that. How does your featured image look? Is it interesting and exciting?

While it does make sense to use a featured image that works with your article, you don’t need to find an image that literally represents the content that you’ve written. Find an image that evokes an emotion that works with what you’ve written. Find an engaging image that makes someone want to get more information. That can be even more powerful than a literal representation of your content.

If you want page views outside of Medium, try Googling the main words in your title. Do you get a zillion hits? Are you ready to compete with that? It’s tempting to want to use keywords that get billions of links, but are you sure you can rank there? If you’re hoping people will find your article, the last thing you want is to end up on page 2,824,716 of a Google search.

They say if you’re anywhere past page two of a Google search, your article may as well not exist.

If you’re using a keyword tool, I’d suggest that you want to stay in the middle of the road. You’re looking for keywords that a lot of people are looking for, but not ones that absolutely everyone is writing about.
The most important thing to keep in mind is that you are joining thousands of other people who are putting their hearts and souls into their pieces and then tossing them into the vast, gaping void that is the internet.
Your job is to help people find what you’ve written. Make something that’s bright and shiny and then treat links like breadcrumbs along the way to finding your post. (We’ll get to links in a minute.)

Images, images, images. And GIFs!
Whitespace is your friend. You want short paragraphs with lots of whitespace. You want visuals. Get some good pictures! Medium offers Unsplash images inside of your post. Just click the plus sign on a new line and then the magnifying glass icon and you’ll have access to thousands of images. All you have to do is search for an image that makes you happy.

Click the plus sign

Click the magnifying glass to grab an image from Unsplash (or the camera for one of your saved images)

And search away!

If you want to step it up a little, check out Pixabay, Pexels, and any other free (or paid) sites. Take it to the next level and grab a GIF from GIPHY! You can find one you want, click on the little link icon, and grab the GIF link. Then come back to Medium, paste that link on its own line and hit “return.” Wait for a second or two (or twenty…), and your GIF will appear like magic.

Click the link icon on the GIF you like

(Paste the link on its own line in Medium and hit Return.)

Always make sure you put credits and links to the places where you found your images. If you didn’t take the picture, then use the caption space below the image to link to the spot where you found it. If you don’t have the right to use it, then don’t use it.

Link to yourself

How often have you been wandering aimlessly around the internet and found an article you liked reasonably well? Fairly often, right? Do you then go hunting for more articles by the same writer? How often do you take the time to go and search for a writer’s profile or website?

…basically never?

What if they had a link to another one of their articles right there for you to click?

I’d guess that people are approximately one million times more likely to click a link to a related story than to go hunting for more pieces from a writer they stumbled across online.

Pay attention to how you interact with articles and stories. If you’ve been on Medium for a while, you might be used to looking for people’s profiles, but what about when you were new here?

Search for something on the internet and pay attention to what you do. Do you click on clickbait titles or do you avoid them? Do you read through big walls of text, or do you like short paragraphs and interesting pictures? Every time you like something that you’ve read, do you take a bunch of time to hunt for the writer?

We all like to believe that we’re completely unique, but the reality is that a lot of other people out there will behave almost exactly the way you do across the board. Pay attention to what you do!

Make life easier for your readers and they’ll almost certainly make life better for you.

Choose your tags wisely

You can use up to five tags when you publish your article. Use them all! Medium tells you right there how many followers those tags have. Some do better than others for views and interaction. Some have more followers than others. I like to choose one or two really big tags and three medium-sized ones. Keep in mind that the more people use a particular tag for their pieces, the faster your article in that tag might be buried.

Choosing tags that are big but not enormous works really well when you’re starting out.

If you want to be seen, you probably want to submit your work to a publication. That makes all the difference. You can check out Smedian for top publications, but it can be hard to have your pieces approved and published in the large publications when you’re new to this. Don’t be afraid to start with a smaller publication. Give it a shot!

When you have great, well-edited content with images, a great title, an eye-catching featured image, and solid tags, go ahead and publish it.

Congratulations!

Are you done?

Not even close.

Share it!

Get out there and share that article! Put it on social media. Link to it everywhere and ask your people to clap for it if you’re new at this. Text your mom and ask for claps. I know that’s no fun, but I promise you that after your story has about 50 claps or so, the claps come a lot easier for people. No one wants to give the first few claps, so beg, borrow, and steal them if you can. You won’t need to do this forever.

Think about places you can share your posts. Put that link on Facebook, Twitter, and LinkedIn. Put it on Instagram or Pinterest if that makes sense. Are you in any online groups where your link might be appropriate? Submit to Hacker News if your post is tech-related. Reddit, StumbleUpon, and Digg are always out there for sharing.

You have so many options!

If you want to go the extra mile and you have a few pieces to share, consider scheduling your posts through a social media management platform like Buffer. There’s a little work involved in loading your links and creating your posts. Once they’re in there, though, you can get things set up and forget about it for a few days. Buffer will figure out when the best times to post are and send out your social media stuff for you.

Don’t post your article once and then forget it if you want views. Keep putting it out there. No one remembers that link they saw on Twitter four days ago. Put it up again and someone new will see it!

Help people see you. The internet is enormous and no one is going to find you hiding alone out there in the dark, too proud to light a couple of flares. Use those links!

Google loves links, which is another great reason to make sure they’re out there. The Great Google Algorithm seems to prefer posts that have a lot of links to them. Go back to your old posts and add a link or two to the new one. Add a link in your new post to one or two of your old articles! Whenever you post, share a link to your article anywhere it makes sense to share it. Are you in any groups on social media where people share their posts? You should be! Share it there. Participate in those groups as much as you can.

It’s overwhelming to try to stay on top of it, but you want that community. Find a Facebook group or two! There are a lot of good ones out there. You might want to check out Medium Mastery, which is a solid and well-established group.

Keep writing

Whether you publish once a month or ten times a day, keep writing. The more often you can put good pieces out there, the more people will find you and read what you’ve written. Even publishing only every week or two, my stats took a hit when things got tough over here and I didn’t publish anything for three weeks.

When you finish a piece, take a minute to celebrate and then start the process over again.

Image by Pexels from Pixabay

Write something that you’re proud of, share it everywhere, and then write something even better and share that too. Don’t stop sharing. There are thousands of people out there who would love to read what you’ve written. Most of them will miss your post when you publish it. No one is checking every page of Medium every day. That would be impossible.

Raise your hand, shine some light, and share your hard work with the world.

Photo by Xan Griffin on Unsplash

As always, if you do anything cool with this information, let people know about it in the comments below or reach out anytime on LinkedIn @annebonnerdata!
If you want to take a look at some of the other pieces I’ve written for examples of whitespace, images, and post length, head on over to my profile.

Thanks for reading!

The post How to Write and Publish Articles That Get Noticed appeared first on Content Simplicity.

The Ultimate Beginner’s Guide to Data Scraping, Cleaning, and Visualization

Anne B — Wed, 14 Oct 2020 17:13:22 +0000

Photo by Nitin Sharma from Pexels

How to take your model from unremarkable to amazing simply by cleaning and preprocessing your data

Data cleaning done right will change your life.

If you have a model that has acceptable results but isn’t amazing, take a look at your data! Taking the time to clean and preprocess your data the right way can make your model a star.

Photo by Burst from Pexels

In order to look at scraping and preprocessing in more detail, let’s look at some of the work that went into “You Are What You Tweet: Detecting Depression in Social Media via Twitter Usage.” That way, we can really examine the process of scraping Tweets and then cleaning and preprocessing them. We’ll also do a little exploratory visualization, which is an awesome way to get a better sense of what your data looks like! We’re going to do some of the most basic cleaning and preprocessing work here: it’s up to you to really get these Tweets in order when you’re building your model!

A little background

More than 300 million people suffer from depression and only a fraction receive adequate treatment. Depression is the leading cause of disability worldwide and nearly 800,000 people every year die due to suicide. Suicide is the second leading cause of death in 15–29-year-olds. Diagnoses (and subsequent treatment) for depression are often delayed, imprecise, and/or missed entirely.

It doesn’t have to be this way! Social media provides an unprecedented opportunity to transform early depression intervention services, particularly in young adults.

Roughly 6,000 Tweets are sent every second, which corresponds to over 350,000 per minute, 500 million per day, and around 200 billion per year. Pew Research Center states that 72% of the public currently uses some type of social media. This project captures and analyzes linguistic markers associated with the onset and persistence of depressive symptoms in order to build an algorithm that can effectively predict depression. With an algorithm that can analyze Tweets exhibiting self-assessed depressive features, individuals, parents, caregivers, and medical professionals could analyze social media posts for linguistic clues that signal deteriorating mental health well before traditional approaches currently do. This kind of analysis allows for a low-profile assessment that can complement traditional services.

Where do we start?

We need data!

Photo by Quang Nguyen Vinh from Pexels

Gathering Data

Building a depression detector required two kinds of tweets: random tweets that do not necessarily indicate depression, and tweets suggesting that the user may have depression and/or depressive symptoms. Random tweets can be sourced from the Sentiment140 dataset available on Kaggle, but for this binary classification model, a dataset that utilizes the Sentiment140 dataset and offers a set of binary labels proved to be the most effective for building a robust model. There are no publicly available datasets of tweets indicating depression, so “depressive” Tweets were retrieved using the Twitter scraping tool TWINT. Tweets were collected by searching for terms related to depression, specifically the lexical terms identified in the unigram by De Choudhury et al. The scraped Tweets were manually checked for relevance (for example, keeping Tweets about emotional rather than economic or atmospheric depression) and were then cleaned and processed.

TWINT is a remarkably simple tool to use!

You can download it right from the command line with:

pip install twint

If you want to, for example, search for the term “depression” on July 20, 2019 and store the data as a new csv named “depression,” you would run a command like:

twint -s "depression" --since 2019-07-20 -o depression --csv

Once you’ve gathered the Tweets, you can start cleaning and preprocessing them. You’ll probably wind up with a ton of information that you don’t need, like conversation ids and so on. You may decide to create multiple CSVs that you want to combine. We’ll get to all of that!

How did the model perform?

At first? Not that impressively. After a basic cleaning and preprocessing of the data, the best results (even after spending time fine-tuning the model) hovered around 80%.

The reason for that really made sense after I examined word frequency and bigrams. Explore your data! Once I looked at the words themselves, I realized that it was going to take a lot of work to clean and prepare the dataset the right way, and that doing so was an absolute necessity. Part of the cleaning process had to be done manually, so don’t be afraid to get in there and get your hands dirty. It takes time, but it’s worth it!

In the end? The accuracy of the model was evaluated and compared to a binary classification baseline model using logistic regression. The models were analyzed for accuracy and a classification report was run to determine precision and recall scores. The data were split into training, testing, and validation sets and the accuracy for the model was determined based on the model’s performance with the testing data, which were kept separate. While the performance of the benchmark logistic regression model was 64.32% using the same data, learning rate, and epochs, the LSTM model performed significantly better at 97.21%.
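
To make the accuracy, precision, and recall figures a little more concrete, here is a minimal sketch of how those scores are computed for a binary classifier. The labels below are made up purely for illustration (1 = depressive, 0 = random), and `binary_metrics` is a hypothetical helper, not the code used in the project.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, precision, and recall for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Made-up labels for illustration only
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

scores = binary_metrics(y_true, y_pred)
print(scores)
```

Accuracy alone can look great on a lopsided dataset, which is why it helps to check precision and recall alongside it.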

So how did we get from the scraped Tweets to the results?

Practice, practice, practice! (And some serious work.)

Photo by DSD from Pexels

Basic Data Cleaning and Preprocessing

Let’s say we scraped Twitter for the search terms “depression,” “depressed,” “hopeless,” “lonely,” “suicide,” and “antidepressant,” and we saved each set of scraped Tweets as “tweets.csv” in a folder named for its search term (“depression/tweets.csv,” “hopeless/tweets.csv,” and so on).

We’ll start with a few imports

import pandas as pd  
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')

%matplotlib inline
%config InlineBackend.figure_format = 'retina'
import re
from nltk.tokenize import WordPunctTokenizer
tok = WordPunctTokenizer()

We’ll read one of our CSV files and take a look at the head.

hopeless_tweets_df = pd.read_csv('hopeless/tweets.csv')
hopeless_tweets_df.head()

First of all, we should get rid of any information stored in the dataset that isn’t necessary. We don’t need names, ids, conversation ids, geolocations, and so on for this project. We can get those out of there with:

hopeless_tweets_df.drop(['date', 'timezone', 'username', 'name', 'conversation_id', 'created_at', 'user_id', 'place', 'likes_count', 'link', 'retweet', 'quote_url', 'video', 'user_rt_id', 'near', 'geo', 'mentions', 'urls', 'photos', 'replies_count', 'retweets_count'], axis = 1, inplace = True)

Now we have this, which is much easier to deal with!

Now just do that with all of the CSVs you created with your search terms and we can combine our separate datasets into one!

df_row_reindex = pd.concat([depression_tweets_df, hopeless_tweets_df, lonely_tweets_df, antidepressant_tweets_df, antidepressants_tweets_df, suicide_tweets_df], ignore_index=True)

df_row_reindex

Before we go any further, let’s drop the duplicates

depressive_twint_tweets_df = df_row_reindex.drop_duplicates()

And save our dataset as a new CSV!

export_csv = depressive_twint_tweets_df.to_csv(r'depressive_unigram_tweets_final.csv')
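
If you have a lot of search terms, the load, trim, combine, and de-duplicate steps above can be wrapped in a small loop. This is only a sketch under a few assumptions: `load_trimmed` and `KEEP_COLUMNS` are hypothetical names, the column names are guesses at what you want to keep, and the in-memory strings stand in for your real files such as 'hopeless/tweets.csv'.

```python
import io

import pandas as pd

# Assumption: these are the only columns we want to keep
KEEP_COLUMNS = ["id", "tweet"]


def load_trimmed(source):
    """Read one scraped CSV and keep only the columns we care about."""
    df = pd.read_csv(source)
    return df[[col for col in KEEP_COLUMNS if col in df.columns]].copy()


# Stand-ins for real files like 'hopeless/tweets.csv' and 'lonely/tweets.csv'
fake_files = {
    "hopeless": io.StringIO(
        "id,username,tweet\n1,a,feeling hopeless\n2,b,so hopeless today\n"
    ),
    "lonely": io.StringIO(
        "id,username,tweet\n3,c,lonely again\n1,a,feeling hopeless\n"
    ),
}

frames = [load_trimmed(source) for source in fake_files.values()]

# Combine everything and drop rows that showed up under more than one term
combined = pd.concat(frames, ignore_index=True).drop_duplicates()
print(combined)
```

Keeping a short list of columns you want is often less fragile than listing every column you want to drop, since the scraper's output columns can vary.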

More Advanced Preprocessing

Before the data could be used in the model, it was necessary to expand contractions, remove links, hashtags, capitalization, and punctuation. Negations needed to be dealt with. That meant creating a dictionary of negations so that negated words could be effectively handled. Links and URLs needed to be removed along with whitespaces. Additionally, stop words beyond the standard NLTK stop words needed to be removed to make the model more robust. These words included days of the week and their abbreviations, month names, and the word “Twitter,” which surprisingly showed up as a prominently featured word when the word clouds were created. The tweets were then tokenized and PorterStemmer was utilized to stem the tweets.
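
As a rough illustration of those steps, here is a minimal sketch that lowercases a Tweet, removes links and hashtags, expands a few contractions, and strips punctuation. The `CONTRACTIONS` dictionary is a tiny hypothetical stand-in for the much larger dictionary the project needed, and `basic_clean` is an illustrative name rather than the project's actual function.

```python
import re

# Assumption: a tiny sample dictionary; a real one would be far longer
CONTRACTIONS = {
    "can't": "cannot",
    "won't": "will not",
    "don't": "do not",
    "isn't": "is not",
    "i'm": "i am",
    "it's": "it is",
}

contraction_pattern = re.compile(
    r"\b(" + "|".join(re.escape(key) for key in CONTRACTIONS) + r")\b"
)


def basic_clean(tweet):
    """Lowercase, drop links and hashtags, expand contractions, strip punctuation."""
    text = tweet.lower().replace("’", "'")    # normalize curly apostrophes
    text = re.sub(r"https?://\S+", " ", text)  # remove links
    text = re.sub(r"#\w+", " ", text)          # remove hashtags
    text = contraction_pattern.sub(lambda m: CONTRACTIONS[m.group(1)], text)
    text = re.sub(r"[^a-z\s]", " ", text)      # remove punctuation and digits
    return " ".join(text.split())              # collapse extra whitespace


cleaned = basic_clean("I can't sleep and it’s NOT okay... #lonely https://t.co/abc")
print(cleaned)
```

Expanding contractions before stripping punctuation matters: otherwise “can’t” turns into “can t” and the negation gets lost.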

Let’s take out all of the stuff that isn’t going to help us!

Imports, of course

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import itertools
import collections
import re
import networkx as nx

import nltk
nltk.download(['punkt','stopwords'])
from nltk.corpus import stopwords
from nltk import bigrams

import warnings
warnings.filterwarnings("ignore")

sns.set(font_scale=1.5)
sns.set_style("whitegrid")
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
analyzer = SentimentIntensityAnalyzer()

%matplotlib inline
%config InlineBackend.figure_format = 'retina'

Read in your new CSV


pd.read_csv('depressive_unigram_tweets_final.csv')

Turn it into a Pandas dataframe


df_new = pd.read_csv('depressive_unigram_tweets_final.csv')

Now let’s see if there are any null values. Let’s clean it up!
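
A quick null check might look something like this sketch. It uses a tiny made-up dataframe (`example_df`) rather than the real dataset, and it assumes the Tweet text lives in a column named 'tweet'.

```python
import pandas as pd

# Made-up stand-in for the real dataframe
example_df = pd.DataFrame(
    {"id": [1, 2, 3], "tweet": ["feeling low", None, "so tired"]}
)

# Count the missing values in each column
null_counts = example_df.isnull().sum()
print(null_counts)

# Drop rows with no Tweet text, since string methods like .split() fail on them
example_df = example_df.dropna(subset=["tweet"]).reset_index(drop=True)
print(len(example_df))
```

Clearing out empty Tweets first keeps the text-processing steps that follow from tripping over missing values.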

We’ll quickly remove stopwords from the Tweets with

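# Build the stop word list once, then filter each Tweet against it
stop_word_list = set(stopwords.words('english'))
df_new['clean_tweet'] = df_new['tweet'].apply(lambda x: ' '.join([item for item in x.split() if item not in stop_word_list]))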

If you want to, you can analyze the Tweets for VADER sentiment analysis scores!

df_new['vader_score'] = df_new['clean_tweet'].apply(lambda x: analyzer.polarity_scores(x)['compound'])

From there, you can also create labels. For a binary classification model, you may want a binary labelling system. However, be aware of your data! Sentiment scores alone do not indicate depression and it is far too simplistic to assume that a negative score indicates depression. In fact, anhedonia, or loss of pleasure, is an extremely common symptom of depression. Neutral, or flat, Tweets are at least as likely, if not more likely, to be an indicator of depression and should not be ignored.

For the purposes of experimentation, you may want to set a sentiment analysis label like this. Feel free to play around with it!

positive_num = len(df_new[df_new['vader_score'] >= 0.05])
negative_num = len(df_new[df_new['vader_score'] < 0.05])

df_new['vader_sentiment_label']= df_new['vader_score'].map(lambda x:int(1) if x>=0.05 else int(0))
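
Because neutral Tweets matter here, you may also want to experiment with a three-way label that keeps them separate instead of folding them in with the negative ones. This sketch assumes the commonly used VADER cutoffs of 0.05 and -0.05 on the compound score; `vader_label` is a hypothetical helper and the scores are made up.

```python
def vader_label(compound, threshold=0.05):
    """Map a VADER compound score to 'positive', 'neutral', or 'negative'."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Made-up compound scores for illustration
example_scores = [0.6, 0.0, -0.4, 0.05, -0.05]
labels = [vader_label(score) for score in example_scores]
print(labels)
```

With a dataframe, you could apply the same function with something like df_new['vader_score'].map(vader_label).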

If you need to, drop what you don’t need

df_new = df_new[['Unnamed: 0', 'vader_sentiment_label', 'vader_score', 'clean_tweet']]

df_new.head()

Go ahead and save a csv!


df_new.to_csv('vader_processed_final.csv')

Let’s keep playing!


df_new['text'] = df_new['clean_tweet']
df_new['text']

We can remove URLs

def remove_url(txt):
    # Strip URLs and any character that isn't a letter, digit, space, or tab
    return " ".join(re.sub(r"([^0-9A-Za-z \t])|(\w+:\/\/\S+)", "", txt).split())

all_tweets_no_urls = [remove_url(tweet) for tweet in df_new['text']]
all_tweets_no_urls[:5]

Now let’s make everything lowercase and split the Tweets.


#lower_case = [word.lower() for word in df_new['text']]
sentences = df_new['text']

all_tweets_no_urls[0].split()

words_in_tweet = [tweet.lower().split() for tweet in all_tweets_no_urls]
words_in_tweet[:2]

Data cleaning done manually

It’s not fun and it’s not pretty, but manual cleaning was critical. It took hours, but getting rid of references to things like tropical depressions and economic depressions improved the model. Removing Tweets that were movie titles improved the model (you can see “Suicide Squad” in the bigrams below). Removing quoted news headlines that included the search terms improved the model. It felt like it took an eternity to do, but this step made an enormous difference in the robustness of the model.
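
You can’t automate this step away, but you can make it less painful by flagging likely off-topic Tweets so you know where to look first. This is a minimal sketch: the patterns below are illustrative guesses based on the examples above (`OFF_TOPIC_PATTERNS` and `needs_review` are hypothetical names), and anything flagged should still be checked by a human.

```python
import re

# Assumption: a few illustrative patterns; build your own from what you see in the data
OFF_TOPIC_PATTERNS = [
    r"\btropical depression\b",
    r"\b(economic|great) depression\b",
    r"\bsuicide squad\b",
]

off_topic = re.compile("|".join(OFF_TOPIC_PATTERNS), flags=re.IGNORECASE)


def needs_review(tweet):
    """Return True if the Tweet matches a likely off-topic phrase."""
    return bool(off_topic.search(tweet))


# Made-up Tweets for illustration
tweets = [
    "Tropical Depression forms off the coast",
    "i feel so hopeless and alone tonight",
    "Just watched Suicide Squad again",
    "the Great Depression unit in history class",
]

flagged = [tweet for tweet in tweets if needs_review(tweet)]
kept = [tweet for tweet in tweets if not needs_review(tweet)]
print(flagged)
print(kept)
```

Looking at your most common bigrams is a good way to find new phrases worth adding to the list.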

Exploratory Visualization and Analysis

Now let’s look at character and word frequency!

It‘s fairly easy to analyze the most common words found in the dataset. After removing the stop words, it was apparent that there were certain words that appeared much more frequently than other words.

Let’s count our most common words!

# List of all words
all_words_no_urls = list(itertools.chain(*words_in_tweet))

# Create counter
counts_no_urls = collections.Counter(all_words_no_urls)

counts_no_urls.most_common(15)

And turn them into a dataframe.

clean_tweets_no_urls = pd.DataFrame(counts_no_urls.most_common(15),
                             columns=['words', 'count'])

clean_tweets_no_urls.head()

Hmmm. Too many stopwords. Let’s deal with those.


stop_words = set(stopwords.words('english'))

# Remove stop words from each tweet list of words
tweets_nsw = [[word for word in tweet_words if not word in stop_words]
              for tweet_words in words_in_tweet]

tweets_nsw[0]

Let’s take another look.

all_words_nsw = list(itertools.chain(*tweets_nsw))

counts_nsw = collections.Counter(all_words_nsw)

counts_nsw.most_common(15)

Better, but not great yet. Some of these words don’t tell us much. Let’s make a few more adjustments.


collection_words = ['im', 'de', 'like', 'one']
tweets_nsw_nc = [[word for word in tweet_words if word not in collection_words]
                 for tweet_words in tweets_nsw]

Now


# Flatten list of words in clean tweets
all_words_nsw_nc = list(itertools.chain(*tweets_nsw_nc))

# Create counter of words in clean tweets
counts_nsw_nc = collections.Counter(all_words_nsw_nc)

counts_nsw_nc.most_common(15)

Much better! Let’s save this as a dataframe.


clean_tweets_ncw = pd.DataFrame(counts_nsw_nc.most_common(15),
                             columns=['words', 'count'])
clean_tweets_ncw.head()

What does that look like? Let’s visualize it!

fig, ax = plt.subplots(figsize=(8, 8))

# Plot horizontal bar graph
clean_tweets_no_urls.sort_values(by='count').plot.barh(x='words',
                      y='count',
                      ax=ax,
                      color="purple")

ax.set_title("Common Words Found in Tweets (Including All Words)")

plt.show()

Let’s look at some bigrams!


from nltk import bigrams

# Create list of lists containing bigrams in tweets
terms_bigram = [list(bigrams(tweet)) for tweet in tweets_nsw_nc]

# View bigrams for the first tweet
terms_bigram[0]

# Flatten list of bigrams in clean tweets
# (named all_bigrams so it doesn't overwrite the imported bigrams function)
all_bigrams = list(itertools.chain(*terms_bigram))

# Create counter of bigrams in clean tweets
bigram_counts = collections.Counter(all_bigrams)

bigram_counts.most_common(20)

bigram_df = pd.DataFrame(bigram_counts.most_common(20),
                         columns=['bigram', 'count'])

bigram_df

Certain bigrams were also extremely common, including smile and wide, appearing 42,185 times, afraid and loneliness, appearing 4,641 times, and feel and lonely, appearing 3,541 times.

This is just the beginning of cleaning, preprocessing, and visualizing the data. We can still do a lot from here before we build our model!

Once the Tweets were cleaned, it was easy to see the difference between the two datasets by creating a word cloud with the cleaned Tweets. With only an abbreviated TWINT Twitter scraping, the differences between the two datasets were clear:

Random Tweet Word Cloud:

Depressive Tweet Word Cloud:

Early in the process, it became clear that the most important part of refining the model to get more accurate results would be the data gathering, cleaning, and preprocessing stage. Until the Tweets were appropriately scraped and cleaned, the model had unimpressive accuracy. By cleaning and processing the Tweets with more care, the robustness of the model improved to 97%.

If you’re interested in learning about the absolute basics of data cleaning and preprocessing, take a look at this article!

The complete beginner’s guide to data cleaning and preprocessing
How to successfully prepare your data for a machine learning model in minutes

Thanks for reading! As always, if you do anything cool with this information, let everyone know about it in the comments below or reach out any time!

The post The Ultimate Beginner’s Guide to Data Scraping, Cleaning, and Visualization appeared first on Content Simplicity.

Getting started with Git and GitHub: the complete beginner’s guide

Anne B — Fri, 04 Sep 2020 04:44:40 +0000

Photo by James Bold on Unsplash

Looking to get started with Git and GitHub? Do you need to collaborate with a team? Are you working on a project? Have you recently discovered that you pretty much need to be on GitHub if you want anyone to take you seriously in tech?

…do you really just want to contribute to your first open source project?

This one’s for you!

Photo by Greyson Joralemon on Unsplash

It’s totally easy to get started with Git. If you’re a fast reader (and you don’t take a lot of time with sign up and installation), you can be up and running on GitHub about ten minutes from right now.

If you go all the way through the article, you can practice cloning an existing repository, creating a branch, making changes, and creating a pull request. Along the way, you might also learn how to find your terminal, use terminal commands, and edit a markdown (.md) file!

If you do all that, congratulations!

You will have contributed to your first open source project — the GitHub Welcome Wall! (If you want to go straight to the open source contribution part, scroll down until you hit the section called, “Let’s do this!”)

This article will get you up and running with the basics. There’s a lot of stuff to learn if you want to use Git and GitHub like a pro, of course. You can go way beyond this introductory information! We’re going to leave the next-level stuff for another time, though.

Let’s get started!

What is Git? What’s GitHub?

Git is the version control tech of choice for basically everybody right now, from developers to designers. GitHub is the social code-hosting platform that’s currently used more than any other. It’s a place where you can play and experiment. It’s a place where you can find (and play around with) the most incredible open-source information, emerging technologies, features, and designs. It’s a place to learn and it’s a place to get involved. You can keep code there for work or for school, and you can grab some sweet code that you want to explore further. You can even host websites for free directly from your repository! (Our project is hosted right from the GitHub repository!)

Photo by Jamie Haughton on Unsplash

There are a ton of ways to use Git and GitHub, but getting started with GitHub doesn’t have to be overwhelming. You don’t need to be some kind of master coder or anything. You can even do the most important things right on the GitHub website!

That being said, it’s a good idea to find your terminal and get just the tiniest bit comfortable with it. Terminal commands make things so much faster! I’ll definitely show you how to get started using the GitHub website. I’ll also show you some terminal commands that you might want to use to make your life just a little bit nicer.

Any time you see a command in this article that includes these marks: < > , you want to delete those marks and replace what’s between them with your own information.

Let’s say you see something like git add <filename>. That means that you would type, for example, git add hello_world.py if you wanted to add a file named “hello_world.py” to your GitHub repository.
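If it helps to see that placeholder rule spelled out, here’s a minimal Python sketch of it. The fill_in helper is a made-up name for illustration (it isn’t part of Git or GitHub); it just swaps each < > marker, brackets included, for the value you supply.

```python
import re

# Matches a placeholder such as <filename>: angle brackets around a name.
PLACEHOLDER = re.compile(r"<([^<>]+)>")


def fill_in(template, **values):
    """Replace every <name> marker, brackets included, with your own value."""
    def swap(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value given for <{name}>")
        return values[name]

    return PLACEHOLDER.sub(swap, template)


print(fill_in("git add <filename>", filename="hello_world.py"))
```

Notice that the brackets disappear along with the placeholder name, which is exactly what you do by hand when you type the command.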

I’m going to give you a lot of explanation here, but these are all the terminal commands that you really need to know to get started:

git clone <repository_link>
git status
git add <filename>
git commit -m "<commit message>"
git push

That’s it! Those are the big ones! If you have a handle on those, you’re good to go. You can start working on your projects immediately!

Photo by Delaney Dawson on Unsplash

We’ll also talk about

git init
git branch
git merge
git checkout

You might be working with other people, or you might want to make changes and test them out before you really commit them. The commands above are what you need to get started with collaboration.

git help

is also seriously useful if you’re just starting out! We’ll discuss that too.

(If you’re on a Mac, you already have a terminal! You can search for it by clicking on the magnifying glass icon in the upper right-hand corner of your screen and searching for the word “terminal.”)

Step 1: Sign up and installation!

Go to GitHub and sign up for an account. You could just stop there and GitHub would work just fine. It’s a good idea, though, to install Git if you haven’t already. You can absolutely get started without it, but if you want to work on your local computer, then you want to have Git installed. You can download it or install it via your package manager instead.

Now go to your terminal and introduce yourself to Git! To set your username for every repository on your computer, type

git config --global user.name "<your_name_here>"

replacing <your_name_here> with your own name, inside the quotation marks. You can use any name or handle you want. If you want to set your name for just one repository, leave out the --global flag.

Now you can tell Git your email. Make sure it’s the same email you used when you signed up for GitHub:

git config --global user.email "<your_email@email.com>"

It’s easy to keep your email private, and you can find those instructions in this article. You only need to check two boxes in your GitHub account.

Now you’re ready to start using Git on your computer!

Photo by Matty Adame on Unsplash

To get started, you can create a new repository on the GitHub website or perform a git init to create a new repository from your project directory.

The repository consists of three ‘trees.’ First is the working directory, which holds the actual files. The second one is the index or the staging area. Then there’s the head, which points to the last commit you made.
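Here’s a toy Python model of those three trees, in case a picture in code helps. It’s only a sketch: ToyRepo and its methods are invented for illustration, and real Git stores things very differently. The point is just that add moves a file into the index, and only commit changes what HEAD points to.

```python
class ToyRepo:
    """A toy model of Git's three 'trees' (not how Git really stores data)."""

    def __init__(self):
        self.working_dir = {}   # filename -> content: the actual files
        self.index = {}         # the staging area: changes proposed for the next commit
        self.commits = []       # history; each commit is (message, snapshot)

    @property
    def head(self):
        """HEAD: the snapshot from the last commit (empty before the first one)."""
        return self.commits[-1][1] if self.commits else {}

    def add(self, filename):
        # Like `git add <filename>`: copy the file from the working directory to the index.
        self.index[filename] = self.working_dir[filename]

    def commit(self, message):
        # Like `git commit -m "..."`: record the index as a new snapshot.
        self.commits.append((message, dict(self.index)))


repo = ToyRepo()
repo.working_dir["hello_world.py"] = "print('hello')"
repo.add("hello_world.py")
print(repo.head)   # still empty: the file is staged, but not committed
repo.commit("Add hello_world.py")
print(repo.head)   # now the snapshot includes the file
```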

I’m already comfortable with the terminal (Option 1)

Here’s how you can get started right from the terminal:

If you have a project directory, just go to your terminal and in your project directory run the command

git init

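You can also name the directory you want to initialize. Running

git init .

does the same thing as plain git init when you’re already in your project directory. Either way, Git isn’t tracking any of your files until you add them.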

Let’s say you have a folder for your project called “new_project.” You could head on over to that folder in your terminal window and add a local repository to it by running

cd new_project
git init

Now you have a new hidden directory called .git in your project directory. This is where Git stores what it needs so that it can track your project. Now you can add files to the staging area one by one with

git add <filename>

or run

git add .

to add all of your files to the staging area. You can commit these changes with the command

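git commit -m "<commit message>"

and if you’re happy with your changes (and your local repository is connected to a remote one, like a repository on GitHub), you can run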

git push

to push your changes through. You can check whether or not you have changes to push through any time by running

git status

If you made some changes, you can update your files one at a time with

git add <filename>

or all at once with

git add --all

Then commit them with your commit message and push them through.

That’s it! You can now initialize a repository, commit files, commit changes, and push them through to the master branch.

If you’ve got this, just scroll down to “Learning to work with others” to move on to branching and collaboration!

Photo by Jonathan Daniels on Unsplash

I don’t know what you just said (Option 2)

I’m going to assume that if you’re interested in Option 2, you’re brand new to all of this. Maybe you have a folder full of files (or plan to have one) that you want to put on GitHub, and you just don’t know how to do that.

Let’s make that happen!

Say you want to create a new repository. (You probably do! That’s where your project will live. If you aren’t going to create a new repository, you probably want to clone an existing repository. We’ll talk about that next, but that’s how you grab someone else’s project and information that you need for your job or the course you’re taking.)

Your repository is where you’ll organize your project. You can keep folders, files, images, videos, spreadsheets, Jupyter notebooks, data sets, and anything else your project needs. Before you can work with Git, you have to initialize a repository for your project and set it up so that Git will manage it. You can do this right on the GitHub website.

It’s a smart idea to include a README file with information about your project. You can create one at the same time that you create your repository with the click of a checkbox.

Go to the GitHub website, look in the upper right corner, and click the + sign and then click “New repository.”

Name the repository, and add a quick description.

Decide whether you want this to be a public or a private repository.
Click “Initialize this repository with a README” if you want to include the README file. (I definitely recommend doing this! It’s the first thing people are going to look at when they check out your repository. It’s also a great place to put information that you need to have in order to understand or run the project.)

New repository

Creating your new repository

You can totally start working right from this point if you want to! You can upload files, edit files, and so on right from your repository on the GitHub website. However, you might not be satisfied with only this option.

There are two ways to make changes to your project. You can make changes in your files/notebooks on your computer and you can also make changes right on GitHub.

Let’s say you want to make some changes to your README file right on GitHub.

First, go to your repository.

Click the name of the file to bring up that file (for example, click “README.md” to go to the readme file).

Click the pencil icon in the upper right corner of the file and make some changes.

Write a short message in the box that describes the changes you made (and an extended description if you want).
Click the “Commit changes” button.

Editing your file on GitHub

Committing your changes

Now the changes have been made to the README file in your new repository! (I quickly want to draw your attention to the little button you can check in the image above that will let you create a new branch for this commit and start a pull request. We’ll talk about this later!)

Pretty easy, right?

I prefer to work with files on my local computer rather than try to make everything work from the GitHub website, so let’s set that up now.

Gimmie that project!

You might want to clone your new repository so that you can work on it on your local computer, or you might have an existing repository that you want to clone. (That’s something you might need to do for a project or course.)

In order to clone a repository onto your computer, go to the repository on the GitHub website and click the big green button that says “Clone or download.” (You can definitely download the repository right there and skip the terminal stuff if you just can’t deal with it. But I believe in you, so keep going!) Make sure it says “Clone with HTTPS.” Now click the clipboard icon to copy the link to your clipboard (or highlight that link and copy it).

Clone or download a repository

Now you’ll open up your terminal and get yourself to the place where you want that repository to land. You might be able to, for instance, type

cd Desktop

to get onto the desktop. Then clone your repository right there to make it easy to find. To clone the repository, you type

git clone <repository_link>

Simple! (Don’t forget to change the information between the < > marks to that string of letters and numbers you just copied! Also, make sure you delete the < >.)

If you haven’t moved around in your terminal before, you can move around slowly with the cd command until you get where you want to go. For example, open up your terminal and type ls to list the choices of where you might go next. You might see “Desktop” listed, and you could just type cd Desktop to get to your desktop. Then you can run the git clone command above to clone your repository right onto your desktop.

You might see some user names instead of choices like “Desktop.” In that case, you need to choose a user before you see “Desktop,” so choose the user with cd <user> (replacing <user> with the user name) and then type ls again to see your choices. There’s a very good chance you’ll see “Desktop” now. You’ll type cd Desktop if you see the Desktop listed. Now go ahead with that git clone!

If you ever want to move back a step in your terminal, just type cd ..

Now you have a new GitHub repository that you can work with cloned right on your desktop! That command pulled in a complete copy of the repository right to your system where you can work on it, make changes, stage the changes, commit the changes, and then push the changes back to GitHub.

You don’t need to put the repository on your desktop if you don’t want to. You can clone it anywhere. You can even run the git clone command as soon as you open up your terminal. I will say, though, that if you aren’t really comfortable navigating around your computer, it’s not a bad idea to have your project sitting right on your desktop where you can see it…

If you ever want to just play with a project on your own, you can fork it on the GitHub website instead of cloning it. Look up near the top right corner of the screen for the “fork” button and click it. This will make a copy of the repository in your repositories for you to play with on your own without doing anything to the original.

Now it’s time to add some files to your project!

Photo by Nadim Merrikh on Unsplash

This is all we’re about to do:

git status
git add <filename>
git commit -m "<commit message>"
git push

Nothing to worry about!

I’m thinking you probably have some files that you want to put in your new repository. Go ahead and find your files and drag and drop them into the new folder for the repository that you created on your desktop, just like you normally would with any set of files you might want to move into a folder.

Now, check out the status of your project!

Go to your terminal and get yourself into the folder for your repository. Then run

git status

to see if everything is up to date. (If you just dragged some files into your project folder, it definitely isn’t!) To add one of your files to the repository, you would run

git add <filename>

Otherwise, you can add everything with

git add --all

or even

git add .

These are your proposed changes. You can do this exact same thing with brand new files and with files that are already in there but have some changes. You aren’t actually adding anything just yet. You’re bringing new files and changes to Git’s attention.
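To make “new files” versus “files with changes” a little more concrete, here’s a rough Python sketch of the comparison. It’s a simplified stand-in: Git really does identify file contents by hash, but the fingerprint and classify helpers and the bookkeeping around them are invented for illustration.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Hash file content so two versions can be compared cheaply."""
    return hashlib.sha1(content).hexdigest()


def classify(working_files: dict, last_commit: dict) -> dict:
    """Label each working file as 'new', 'modified', or 'unchanged'.

    working_files maps filename -> bytes; last_commit maps filename -> fingerprint.
    """
    report = {}
    for name, content in working_files.items():
        if name not in last_commit:
            report[name] = "new"
        elif last_commit[name] != fingerprint(content):
            report[name] = "modified"
        else:
            report[name] = "unchanged"
    return report


committed = {"README.md": fingerprint(b"# My project\n")}
working = {
    "README.md": b"# My project\nNow with notes.\n",
    "hello_world.py": b"print('hello')\n",
}
print(classify(working, committed))
```

Both kinds of files go through the same git add step, which is why the command looks identical either way.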

To commit the changes, you will start the process by running

git commit -m "<commit message>"

You’re committing the changes to the HEAD, but not to the remote repository. (Make sure you replace that message in quotes with your own.) After you make a change, you take a “snapshot” of the repository with the “commit” command. You’ll include a message on that “snapshot” with -m.

When you save a change, that’s called a commit. When you make a commit, you’ll include a message about what you changed and/or why you changed it. This is a great way to let others know what you’ve changed and why.

Now your changes are in the HEAD of your local working copy. To send the changes to your remote repository, run

git push

to push your changes right into your repository. If you’re working on your local computer and you want your commits to be visible online too, you would push the changes up to GitHub with the git push command.

You can see if everything is up to date any time by running the git status command!

So now you have a GitHub repository and you know how to add files and changes to it!

Congratulations!!!

Learning to work with others

Collaboration is the name of the game on GitHub!

Photo by Quinten de Graaf on Unsplash

GitHub flow

Let’s say you have a project going and you maybe have a lot of different ideas and features in mind at any given time. Some features might be ready to go, but some might not. Maybe you’re working with other people who are all kind of doing their own thing. This is where branching comes in!

A branch is a separate space where you can try out new ideas. If you change something on a branch, it doesn’t affect the master branch until you want it to. This means that you can do whatever you want to do on that branch until you decide it’s time to merge it.

The only branch that’s going to permanently change things is the master branch. If you don’t want your changes to deploy immediately, then make your changes on a separate branch and merge them into the master branch when you’re ready.

If you’re working with others and want to make changes on your own, or if you’re working on your own and want to make changes without affecting the master branch, you want a separate branch. You can create a new branch at any time.

It’s also pretty simple to create a branch named “new_feature” in your terminal and switch to it with

git checkout -b new_feature

Once you create a branch, you can make changes on that branch. This makes it easy to see what you’ve changed and why you’ve changed it. Every time you commit your changes, you’ll add a message that you can use to describe what you’ve done.

Let’s talk about checkout!

git checkout

lets you switch to a branch that you’re not currently on. You can check out the master branch with

git checkout master

or look at the “new_feature” branch with

git checkout new_feature

When you’re done with a branch, you can merge all of your changes back so that they’re visible to everyone.

git merge new_feature

will take all of the changes you made to the “new_feature” branch and add them to the branch you currently have checked out (so switch to master first if that’s where you want them).
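If branches still feel a little abstract, this toy Python model might help. It treats a branch as nothing more than a name pointing at a commit. ToyHistory and the commit ids are made up for illustration, and the merge here only covers the simplest case (a “fast-forward,” where the branch you’re merging into has no new commits of its own).

```python
class ToyHistory:
    """Toy model: a branch is just a name pointing at a commit."""

    def __init__(self):
        self.parents = {"c0": None}       # commit id -> parent commit id
        self.branches = {"master": "c0"}  # branch name -> latest commit id
        self.current = "master"

    def checkout(self, name, create=False):
        # create=True mirrors `git checkout -b <name>`: new branch at the current commit.
        if create:
            self.branches[name] = self.branches[self.current]
        if name not in self.branches:
            raise KeyError(f"no branch named {name}")
        self.current = name

    def commit(self, commit_id):
        # A commit moves only the branch you currently have checked out.
        self.parents[commit_id] = self.branches[self.current]
        self.branches[self.current] = commit_id

    def merge(self, other):
        # Fast-forward only: move the current branch up to the other branch's commit.
        self.branches[self.current] = self.branches[other]


history = ToyHistory()
history.checkout("new_feature", create=True)
history.commit("c1")
print(history.branches)  # master has not moved
history.checkout("master")
history.merge("new_feature")
print(history.branches)  # now master points at the new commit too
```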

To push your new branch to GitHub and set the remote branch as its upstream, run

git push --set-upstream origin new_feature

After you make some changes and decide you like them, you open a pull request. If you’re on a team, this is when other people on your team can start checking out your changes and discussing them. You can open a pull request at any point, whether it’s to have people look over your final changes or ask for help because you’re stuck on something.

Ummmmm…what? Can I do that on the website?

You can!

One way to do this is simply by checking that button that we mentioned earlier when we were editing the README file. Super easy!

You can also create a new branch any time right on the website by going to your repository, clicking the drop-down menu near the left-middle side of your screen that says “Branch: master,” typing a branch name, and selecting the “Create branch” link (or hitting enter on your keyboard). Now you have two branches that look the same! This is a great place to make changes and test them out before you want to make them affect the master branch.

Creating a branch

If you’re working on a separate branch, your changes only affect that branch.

If you’re happy with your changes and you want to merge your changes to the master branch, you can open a pull request. This is how, if you were on a team, you would propose your changes and ask someone to review them or pull in your contribution and merge them into their branch.

You can open a pull request as soon as you make a commit, even if you haven’t finished your code. You can do this right on the website if you’re more comfortable with that. If you’ve made some changes on your branch and you want to merge them, you can:

Click the pull request tab near the top center of the screen

Click the green “New pull request” button

Go to the “Example Comparisons” box and select the branch you made to compare with the original branch.

Look over your changes to make sure they’re really what you want to commit.
Then click the big green “Create pull request” button. Give it a title and write a brief description of your changes. Then click “Create Pull Request!”

New pull request

Create pull request

Now if this is your repository, you can merge your pull request by clicking the green “Merge pull request” button to merge the changes into master. Click “Confirm merge,” then delete the branch after your branch has been incorporated with the “Delete branch” button in the purple box.

If you’re contributing to a project, people on the team (or the reviewer) might have questions or comments. If you need to change something, this is the time! If everything is good to go, they can deploy the changes right from the branch for final testing before you merge it. And you can deploy your changes to verify them in production.

If your changes have been verified, you can go ahead and merge your code into the master branch. The pull requests will preserve a record of your changes, which means that you can go through them any time to understand the changes and decisions that have been made.

Update and merge

If you’re working on your computer and want the most up-to-date version of a repository, you’d pull the changes down from GitHub with the git pull command. To update your local repository to the newest commit, run

git pull

in your working directory.

To merge another branch into your active branch, use

git merge <branch_name>

Git will try to auto-merge changes, but this isn’t always possible. Conflicts might arise. If they do, you’ll need to resolve the conflicts manually by editing the files. After changing them, you can mark them as merged with git add <filename>. You can preview your changes before you merge them with

git diff <source_branch> <target_branch>
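Wondering when an auto-merge works and when you get a conflict? Here’s a small Python sketch of the idea. It’s a simplification with made-up names: real Git compares changes line by line inside each file, while this toy three_way_merge function only compares whole files against the version both sides started from.

```python
def three_way_merge(base: dict, ours: dict, theirs: dict):
    """Merge two sets of file contents that both started from `base`.

    Returns (merged, conflicts). A file conflicts when both sides changed it
    in different ways, so a person has to decide what the result should be.
    """
    merged, conflicts = {}, []
    for name in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(name), ours.get(name), theirs.get(name)
        if o == t:          # both sides agree (or neither touched the file)
            result = o
        elif o == b:        # only their side changed it
            result = t
        elif t == b:        # only our side changed it
            result = o
        else:               # both changed it differently
            conflicts.append(name)
            continue
        if result is not None:   # None means the file was deleted
            merged[name] = result
    return merged, conflicts


base = {"README.md": "Hello", "notes.md": "todo"}
ours = {"README.md": "Hello!", "notes.md": "todo"}
theirs = {"README.md": "Hello, world", "notes.md": "done"}
print(three_way_merge(base, ours, theirs))
```

In this example notes.md merges automatically because only one side touched it, while README.md lands in the conflicts list because both sides edited it differently.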

You can switch back to the master branch with

git checkout master

You’ll make your changes and then delete the branch when you’re done with

git branch -d new_feature

This branch isn’t available to anyone else unless you push the branch to your remote repository with

git push origin <branch_name>

Other helpful commands

First of all, this is my favorite GitHub cheatsheet. Check it out for all of the most useful Git commands!

You can see the commit history of the repository if you run

git log

You can see one person’s commits with

git log --author=<name>

You can see what has been changed but not staged yet with

git diff

Need help remembering what command you’re supposed to run? Try

git help

to see the 21 most common commands. You can also type something like

git help clone

to figure out how to use a specific command like “clone.”

Let’s do this!

Photo by Mervyn Chan on Unsplash

Why not leave your mark and welcome everyone who’s here to learn about Git and GitHub? We’re going to create a simple welcome wall with notes from everyone who wants to try out Git and GitHub and contribute to their first open-source project.

You can add whatever you want to the welcome wall, as long as you keep it warm and encouraging. Add a note, add an image, whatever. Make our little world better in whatever way makes you happy. (If you’re an overthinker (I see you ❤️), I have a pre-written message in the README file that you can just copy and paste.)

Clone the repository, either on the GitHub website or by running

git clone https://github.com/bonn0062/github_welcome_wall.git

Create a new branch and add a welcoming and encouraging thought to the “welcome_wall.md” file. You can do this on the website, but I really encourage you to try cloning the repository to your computer, opening the file with your favorite text editor, and adding your message there. It’s just good learning!

Create a pull request.
Write a quick note describing your change and click the green button to create your pull request.

That’s it! If it’s a decent message, thought, image, or idea, I’ll merge your request and you will have successfully contributed to an open-source project.

Congratulations!!! You did it!

As always, if you do anything awesome with this information, I’d love to hear about it! Leave a message in the responses section or reach out any time on Twitter @annebonnerdata.

Thanks for reading! ❤️

The post Getting started with Git and GitHub: the complete beginner’s guide appeared first on Content Simplicity.

The brilliant beginner’s guide to model deployment

Anne B — Fri, 04 Sep 2020 04:37:49 +0000

This article first appeared in Heartbeat by Fritz

You built this amazing machine learning model—this one, let’s say—but now what?

How do you take your model and turn it into something that you can display on the web? How do you turn it into something that other people can interact with? How do you make it useful?

You deploy it!

Photo by Collin Armstrong on Unsplash

Having the knowledge and ability to deploy your machine learning model is an absolute necessity. Whether you’re building a model or generating reports, you need this skill. It takes that model that you poured your blood, sweat, and tears into and turns it into something that absolutely anyone can play with and admire.

This article will walk you through the basics of deploying a machine learning model. We’re going to deploy a PyTorch image classifier with Flask. This is the first critical step towards turning your model into an app.

By the end of this article, you’ll be able to take a PyTorch image classifier and turn it into a cool web app. In this app, users will be able to upload an image of a flower to see what kind of flower it is.

Your deep learning image classifier will now be an awesome image prediction app.

Let’s get started!

First, we should hit the basics. (You can find the official installation guide here if you want to take a look!)

It’s a good idea to set up a virtual environment to manage the dependencies of your project. You can do that by setting up a folder for your project, then going to your terminal and running:

mkdir myproject

cd myproject

python3 -m venv venv

I should let you know now that everything that I’m going to do here works on a Mac with Python 3. If you’re working on Windows or running Python 2, you might want to head on over to the official documentation to see what you might need to tweak to get up and running.

Next, activate your environment.

. venv/bin/activate

Now we can install Flask.

pip install Flask

You’re ready to go!

The quickstart guide is a really helpful document to check out if you’re interested in learning a bit more about the basics. I’m going to start you out with a little information that’s very similar to the information provided in that guide. There isn’t a better or clearer explanation of the basics of Flask than that one.

To create a seriously minimal Flask application, you start by creating a file.

Create the file and open it in your favorite text editor. Then type

from flask import Flask
app = Flask(__name__)

@app.route('/')
def hello_world():
    return 'Hello, there!'

What does the code above do?

First of all, we imported the Flask class. Next, we created an instance of the class. The first argument is the name of the application’s module. If you’re using a single module, you’ll use __name__ so that Flask knows where to look for stuff. The “route” part tells Flask what URL is supposed to trigger our function. We give the function a name, which is also used to generate URLs for that function, and the function returns the message we want to display in the user’s browser.
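If the @app.route line looks like magic, here’s a toy sketch of what a route decorator is doing conceptually. To be clear, this is not Flask’s actual implementation. ToyApp and its handle method are invented for illustration, and the only goal is to show how a decorator can connect a URL to a function.

```python
class ToyApp:
    """A toy stand-in for a web framework's routing table (not Flask itself)."""

    def __init__(self):
        self.routes = {}  # URL path -> the function that handles it

    def route(self, path):
        # Returns a decorator, so `@app.route('/')` registers the function below it.
        def register(view_function):
            self.routes[path] = view_function
            return view_function
        return register

    def handle(self, path):
        # Look up the function for this URL and return whatever it returns.
        if path not in self.routes:
            return "404 Not Found"
        return self.routes[path]()


app = ToyApp()


@app.route('/')
def hello_world():
    return 'Hello, there!'


print(app.handle('/'))
```

When a request for “/” comes in, the app looks up the function registered for that URL, calls it, and sends back whatever the function returned.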

You can save this as hello.py or whatever.py or anything else that makes you happy. Just don’t save it as flask.py because that will conflict with Flask. I like to go with app.py for the main Flask file because that’s going to be what Flask wants to find later.

If you want to run it, go to your terminal and type

export FLASK_APP=app.py

and then

flask run

If everything’s working, you’ll see something like this

Running on http://127.0.0.1:5000/

Now you can click (command-click) on that web address or copy and paste it into your browser. See if it works!

(Any time you want to shut it down, just type control-C in your terminal window.)

Now, here’s the thing I really like to run when I’m trying to create something in Flask:

export FLASK_ENV=development

I run that command before I run flask run. This puts you in development mode. That means that instead of having to do a manual restart every single time you make a change to your code, your server will reload itself when you change your code. It will also provide you with a seriously helpful debugger when things go wrong! (FLASK_ENV was removed in Flask 2.3. On newer versions, export FLASK_DEBUG=1 turns on the same reloader and debugger.)

Image by Miryams-Fotos on Pixabay

That being said, putting Flask into development mode presents a major security risk, so you never, ever, ever want to use it on production machines.

The quickstart guide also tells you how to bind functions to meaningful URLs (which makes it easier for people to come back to your web app), how to create unique URLs, how to render templates, and more! It walks you through how to read and store cookies, how to upload files, and how to set up redirects and errors. Check it out if you’re looking for more of the basics.

Image by DariuszSanowski on Pixabay

On to our project!

You’ll want to begin by installing the packages that this project imports (flask, torch, gunicorn, and PIL, which is installed under the name Pillow), so go to your terminal and run

pip install flask torch gunicorn Pillow

We’ll make a folder for this project and work within it. (If you didn’t create the folder and file earlier, do that now.) Create a folder for this project, navigate to your folder in the terminal and run the commands below one line at a time. Copy the app.py code from the example above and put it in the app.py file if you want to make sure your new web app is working.

sublime app.py
python app.py
flask run

(The command “sublime app.py” above will only work if you want to work in Sublime and have the shortcut set up. You can skip that and just go to your preferred text editor and create a new file called “app.py” if you prefer another text editor.)

You can command-click on the link that shows up, just like we did earlier, or copy and paste it into your browser.

You don’t want to just throw everything up in a string, so create a folder called “templates.” In “templates,” create one file called “index.html” and one file called “result.html.”

Open up index.html in your text editor and set up an HTML template. If you’re using Sublime, you can type html to create a basic HTML template.

Put the name of your project in the title and add “hello, there” between an opening and a closing tag in the body section.

Go back to your app.py file, add render_template to the first line and replace “Hello, there!” with render_template('index.html')

from flask import Flask, request, render_template
app = Flask(__name__)

@app.route('/')
def hello_world():
    return render_template('index.html')

Flask will take a look at app.py, then reach into your templates folder and pull up index.html, which we have set to display “Hello, there!” If you restart your page, you’ll see this

You can tell Flask to reload whenever you save your changes by adding

if __name__ == '__main__':
    app.run(debug=True)

to the bottom of the app.py page to work in debug mode.

You can easily pass in values by changing your app.py file to something like this

from flask import Flask, request, render_template
app = Flask(__name__)

@app.route('/')
def hello_world():
    return render_template('index.html', value='hi')

if __name__ == '__main__':
    app.run(debug=True)

and your index.html file to this




<!DOCTYPE html>
<html>
<head>
    <title>Flower Classifier</title>
</head>
<body>
    Hi there! {{ value }}
</body>
</html>

We’re up and running! However, if we want to build a web app that will allow users to upload a file or an image and display the results, we’ll want to build an app that accepts both “GET” and “POST” methods.

To do that, we change our app.py file to

from flask import Flask, request, render_template
app = Flask(__name__)

@app.route('/', methods=['GET', 'POST'])
def hello_world():
    if request.method == 'GET':
        return render_template('index.html', value='hi')
    if request.method == 'POST':
        return render_template('result.html')

if __name__ == '__main__':
    app.run(debug=True)

and we’ll change our index.html file to




 Flower App


 Upload your flower image

Remember that result.html file we created? Now you want to have something in there, so open up that file and add

Flower App

Prediction

Flower Name: Lily

Now you should be able to reload your browser window (if you’re running in debug mode), upload an image, and see this as your result

Right now, our results page will just say that your image is a lily no matter what you upload.

Congratulations if you’ve gotten this far!

You’re going to want to be able to render the results, and doing that is incredibly simple. If you want to test it out, change your app.py file so that you render your results this way: return render_template('result.html', flower=flower_name) (you’re just adding the second part). Next, replace the “Flower Name” line in your result.html file to read

Flower Name: {{ flower }}

Now you’re going to create some inference! I’m going to assume that you have an image classifier created using PyTorch with a saved checkpoint. You’ll need that to actually make this work! Put that checkpoint in your project folder.

If you don’t have a checkpoint file, check out this article on creating a seriously accurate image classifier in PyTorch. It gives you all of the code you need to create an image classifier and create that checkpoint.

Now we need to write a way to grab the image and send the info to the template. First, you’ll need a function to get the model and create your prediction. Create a commons.py file and write a function to get the model as well as something that will allow you to convert the uploaded file into a tensor. Try this!


Next, create an inference.py file. You need to be able to sort out the flower names, classes, and labels, so you can write something like this:

Update your app.py file so that it reads

(If you’re paying attention, you’ll see that I added a couple of lines in the code above to make sure that you’ll get an error message if your file wasn’t uploaded.)
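As a rough idea of what an upload check can look like, here’s a small standalone Python sketch. It is not the code from this project: upload_error and ALLOWED_EXTENSIONS are hypothetical names, and the list of extensions is just an assumption. In a Flask view, the filename would typically come from the uploaded file object in request.files.

```python
import os

# Hypothetical whitelist; adjust to whatever image types your model accepts.
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def upload_error(filename):
    """Return an error message for a bad upload, or None if it looks fine."""
    if not filename:
        return "No file was uploaded."
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        return f"'{filename}' is not a supported image type."
    return None


for name in ["", "rose.JPG", "notes.txt"]:
    print(repr(name), "->", upload_error(name))
```

Returning a message (instead of raising an exception) makes it easy to pass the text straight to a template so the user can see what went wrong.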

Make sure your result.html file reads something like this:

and you should be able to upload an image and get a result!

Image upload

Results!

That’s it!

You now have a working web application built on your image classifier that can upload an image of a flower and predict its species!

Now it’s up to you to refine your classifier and model. You can figure out how to make your classifier faster and more accurate. (Looking for ways to finetune your model? Check out the official tutorial first! After that, check out this article by Florin Cioloboc and Harisyam Manda. It’s full of great suggestions.) You may want to add code that can let people know if they’ve uploaded the wrong kind of file. You may decide you want people to see the top five species results or the probability that their flower is, in fact, the species that your classifier predicted. What you do from here is up to you!
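If you want to try the “top five with probabilities” idea, the core of it can be sketched in plain Python. This assumes your model hands back one raw score per class. top_predictions is a made-up helper name, and the scores and class names below are invented purely for illustration.

```python
import math


def top_predictions(scores, class_names, k=5):
    """Turn raw model scores into the k most likely (name, probability) pairs."""
    # Softmax: subtract the max first so exp() cannot overflow.
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    probabilities = [e / total for e in exps]
    ranked = sorted(zip(class_names, probabilities), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Made-up scores for three made-up classes, purely for illustration.
names = ["lily", "rose", "tulip"]
for name, probability in top_predictions([2.0, 0.5, 1.0], names, k=2):
    print(f"{name}: {probability:.1%}")
```

You could pass the resulting list to your result template and loop over it there, instead of passing a single flower name.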

…you might also want to make this thing look a little sexier.

By taking three minutes to insert a little CSS in my index.html file plus an image in a separate folder,

I went from this

to this!

This is just the most basic example of how to deploy a PyTorch image classifier to Flask. You can do anything from here!

Our next step will be to turn this baby into an app, so stay tuned. Also, if you want to take a look at this code and the folder structure, you’re welcome to check out this basic model deployment GitHub repo.

As always, if you create anything awesome, please share it in the responses below or reach out any time on Twitter @annebonnerdata!

The post The brilliant beginner’s guide to model deployment appeared first on Content Simplicity.

Get Involved With SciPy

Anne B — Fri, 27 Sep 2019 12:37:00 +0000

SciPy wants your thoughts on its technical documentation and user guides

You’ve heard of SciPy.

You’ve probably used it.

You might have even looked through some of the technical documentation and user guides. You might even have an opinion of the documentation…

But have you given any thought to getting involved with SciPy and letting them know how they can improve their documentation? Telling SciPy what you like and what you don’t like or how you think the documentation can be improved?

Now’s your chance!

What is SciPy?

It’s scientific (Sci) Python (Py)! SciPy is a free and open-source Python library. It’s used for scientific computing and technical computing. It contains modules for optimization, linear algebra, integration, interpolation, special functions, FFT, signal and image processing, ODE solvers and other tasks common in science and engineering.

SciPy uses NumPy arrays as the basic data structure. It has modules for various commonly used tasks in scientific programming. These tasks include integration (calculus), ordinary differential equation solving, and signal processing. SciPy builds on the NumPy array object. It’s part of the NumPy stack. The stack includes tools like Matplotlib, Pandas, and SymPy, and an expanding set of scientific computing libraries.

How can you get involved?

Take a quick survey!

While I’m over at NumPy working on creating a section in the technical documentation aimed at beginners, Maja Gwozdz is hard at work in the SciPy docs. She’s combing through the SciPy documentation to create something that’s even more helpful for you. She’s reaching out to the whole community (that’s you!) to find out what you like and don’t like, and she would love your input!

As Maja wrote in her proposal for Google Season of Docs:

“I intend to work on the refactoring of the existing documentation so that it would be easily accessible by users with different needs. It goes without saying that a researcher is most likely interested in advanced and subtle features, whereas a user without prior expertise appreciates step-by-step guides and diagrams.

I am interested in pursuing this project for personal and professional reasons: first of all, I would like to contribute significantly to SciPy because my own research has greatly benefited from it and secondly, I encounter insufficient (or lacking) documentation all too often in other software and always wonder how much faster (if at all!) users could learn how to use the code had they been provided with a thorough guide.”

Maja put together a survey for everyone in the community who wants to be heard. This is an amazing opportunity to raise your hand and get involved. You can find the survey here and it’s designed to let you give as much of your time and input as you feel like giving.

The questions are very straightforward and most of them have simple multiple-choice answers. You’ll answer questions like, “What parts of the documentation do you use?” and “Which of the documentation features should be improved/added?” Below the multiple-choice questions, you can add your own comments and suggestions.

It’s quick, it’s easy, and it’s incredibly helpful. If you’ve used SciPy and the SciPy documentation, Maja would love to hear from you. It is time to get involved with SciPy.

Take a minute or two to speak up and be heard!

Photo by White Gold Photography from Pexels

Featured photo by Lum3n.com from Pexels

The post Get Involved With SciPy appeared first on Content Simplicity.

NumPy and SciPy and Google Season of Docs, Oh My: Meet Maja Gwózdz

Anne B — Wed, 25 Sep 2019 12:21:00 +0000

Learn more about the technical writers paired with NumPy and SciPy during Google Season of Docs

Welcome! From September through November, our little corner of the open-source world is going to involve technical documentation updates at NumPy and SciPy!

A behind-the-scenes tour

You get to go behind the scenes to meet the people and learn about some of the work we’re doing right now with the technical documentation at NumPy and SciPy.

A few weeks ago, I told you I would let you know more about the behind-the-scenes action and the technical writers who are going to be working with NumPy and SciPy during Google Season of Docs.

Photo by Pixabay from Pexels

It’s time to meet Maja!

Maja has done some knockout research, which you can find here. She has not only had significant experience with SciPy, but she’s well aware of what a difference great documentation and guides can make. Because it’s so easy for technical writers to get lost in the background of a project, I wanted to take this space to let you know what she’s working on in her own words.

If you aren’t familiar with what we’re doing with NumPy and SciPy through Google Season of Docs, you can read all about it here:

What do You Want to See in the NumPy Docs?
Behind the scenes at NumPy and SciPy with Google Season of Docs

While I’m building a new beginner-oriented technical documentation section with NumPy, Maja is working with SciPy to restructure its existing documentation.

Meet Maja Gwózdz!

I made a couple of very minor tweaks, but here’s what Maja had to say about herself and her plans for SciPy and Season of Docs:

About Maja

I completed a BA in English Studies with distinction (Jagiellonian University, Poland) and then obtained an MPhil in Theoretical and Applied Linguistics with distinction from the University of Cambridge. I then decided to pursue a BSc in Computer Science (at the Ludwig Maximilian University in Munich) and take additional courses in mathematics (so far, I have completed the following extension courses either from UC Berkeley or the University of Illinois: Elementary Number Theory, Calculus II, Precalculus, Python Programming). Other relevant technical courses I have taken so far are: Real Analysis, Linear Algebra, Introduction to Programming, Algorithms and Datastructures, Discrete Mathematics and Logic, Introduction to Functional Programming, Introduction to Artificial Intelligence, Logic, Computer Architecture. As regards machine learning, I have a working knowledge of statistics, the Multilayer Perceptron Classifier (especially its application to automatic speech recognition), and other popular Artificial Neural Networks.

While I am not a technical writer in the strictly professional sense of the word, I am familiar with Sphinx and I have performed the tasks of a technical writer on several occasions. For instance, I completed an internship at Lufthansa CityLine, where I was responsible for running penetration tests and writing a technical report on network vulnerabilities. I was also responsible for designing a JIRA / Confluence workflow and preparing a basic guide for internal users. I was a student at GSoC 2018 (it was a project on corpus linguistics involving, among other tasks, the creation of annotation guidelines) and I am currently a mentor at the same organisation (CLiPS, the University of Antwerp).

I am passionate about clear and logical communication of technical matters and I believe that this project suits my background perfectly because I have the required linguistic tools to convey complex ideas plus the necessary mathematical / computer science knowledge to comprehend the subject (or, at least, know how to ask the right questions about the given matter).

I pay great attention to detail but, at the same time, try not to lose sight of the big picture. Whenever I notice that I spend too much time on a less urgent task, I quickly move on to the important phases, so as to meet the deadline (time permitting, I take care of the less urgent issues, of course). Getting stuck is the natural part of any creative process and it is, indeed, valuable but if it becomes a true obstacle I never hesitate to ask for help. This approach has worked very well in my previous projects and I intend to apply it to subsequent endeavours. In the interaction with supervisors and team colleagues, I particularly like constructive criticism and frequent feedback. While support and positive comments are undoubtedly important, I have never made significant progress based on praise alone. I enjoy challenging tasks and, as regards my approach to solving real-life software problems, I believe that actively listening to community members and global users is THE way to create excellent software. It would be an honour to work on SciPy.

Motivation

I intend to work on the refactoring of the existing documentation, so that it would be easily accessible by users with different needs. It goes without saying that a researcher is most likely interested in advanced and subtle features, whereas a user without prior expertise appreciates step-by-step guides and diagrams.

I am interested in pursuing this project for personal and professional reasons: first of all, I would like to contribute significantly to SciPy because my own research has greatly benefited from it and secondly, I encounter insufficient (or lacking) documentation all too often in other software and always wonder how much faster (if at all!) users could learn how to use the code had they been provided with a thorough guide.

Goals

I aim to improve the existing SciPy documentation both content- and graphic-wise. The most important feature of my approach to this problem is the deployment and analysis of the user survey, that is to say, a concise survey conducted online enabling various users to voice their needs regarding the documentation. I strongly believe that their opinions should be the source of inspiration (how else can we create more user-friendly documentation?).

As regards the realisation of the project itself, the first phase will involve designing and analysing the user survey, as well as tackling several stylistic issues I have noticed in the current documentation. For instance, lack of consistency (example: 2-dimensional arrays occurring alongside two-dimensional arrays), convoluted sentences that ought to be rewritten, or the lack of alphabetical order in certain subpages. The second phase will focus on the introduction of graphical guides containing hyperlinks to the relevant topics (based on the survey results and other community requests). In the long run, I wish to achieve a satisfactory documentation tailored to different kinds of users. Moreover, I will attempt to render the tutorials more consistent both linguistically and structurally. Last but not least, I aim to write new tutorials (based on the current community needs).

User survey

As regards the user survey, I propose to use Google Forms for several reasons. First of all, Google Forms is free and offers unlimited functionality (in terms of the number of respondents, questions, etc.), it has an appealing visual form, the most useful survey options (for instance, the customisable linear scale, checkboxes, and multiple choice), and, most importantly, the results can be easily exported for the purposes of statistical analysis. Based on online research, it appears that Google Forms is, at least for now, the best free tool for conducting surveys. On a less serious note, it would be a nice gesture to use a Google product in a Google-run initiative.

I have created a preliminary survey with sample questions (it can be accessed here). A reasonable number of questions in the final version ought to be between ten and fifteen. In order to obtain concrete results, I suggest that we predominantly use multiple-choice questions, a linear scale, and a few checkboxes. The linear scale should not resemble a full spectrum, though (it only causes confusion and the results are likely to suffer from high dispersion). There ought to be at maximum two open-ended questions, otherwise, the results will be highly dispersed and not helpful at all. I reckon that even a very high number of responses would not be problematic due to the fact that the data can be easily exported and analysed automatically with statistical software. Assuming that the number of responses is, indeed, very high, the analysis of open-ended questions could be a little time-consuming but I presume that it will not be overwhelming. After all, an average user is not likely to write an essay about the state of the documentation. In the worst-case scenario, some answers can be simply stored for future analysis.

Graphical guides

My vision of the graphical guides (intended to serve as navigational tools) is based on a popular premise that (most) humans are better at processing straightforward visual structures rather than purely text-based information. Moreover, a thematically-oriented diagram with lines connecting similar topics of interest is, undoubtedly, a highly valuable asset for less experienced users (and not only).

As regards the implementation details, I propose to use the TikZ package. First and foremost, it is a powerful tool and does not seem to be at risk of being deprecated soon. It also offers high-quality output, has really solid documentation, and is a frequent topic on TeX StackExchange and other mainstream forums. Most importantly, the integration of a TikZ file (more precisely, the numerous hyperlinks therein) with HTML documentation does not appear to pose significant problems due to the existence of various packages and fixes for embedding a TikZ picture in HTML (for instance, TeX4ht).

The question of future maintenance of the guides within SciPy can be easily solved by using, say, Overleaf (facilitates collaboration plus offers an instant preview) and predefined templates that I will supply. Basically, the graphical guides are not likely to differ hugely from one another. The structure, colour palette, and shapes are, more or less, going to be invariant, therefore subsequent re-shaping and further customisation will not be an issue. A rough sketch of such a guide (observe the counter-clockwise alphabetical order in the subcategories) is provided on the next page. The complete diagram will, of course, contain hyperlinks to the respective sections in the documentation.

Featured Photo by Philipp Deus from Pexels

The post NumPy and SciPy and Google Season of Docs, Oh My: Meet Maja Gwózdz appeared first on Content Simplicity.

NumPy and SciPy and Google Season of Docs, Oh My: Meet Christina Lee

Anne B — Sat, 21 Sep 2019 12:15:00 +0000

Learn more about the technical writers paired with NumPy and SciPy during Google Season of Docs

From September through November, our little corner of the open-source world is going to involve technical documentation updates at NumPy and SciPy!

Welcome to NumPy and SciPy!!!

You’re going behind the scenes to meet the people and learn about some of the work we’re doing right now at NumPy and SciPy.

A couple of weeks ago, I told you I would let you know more about the technical writers who are going to be working with NumPy and SciPy during Google Season of Docs. It’s time to meet Christina Lee!

If you aren’t familiar with the project, you can read all about it here:

What do You Want to See in the NumPy Docs?
Behind the scenes at NumPy and SciPy with Google Season of Docs

What is Google Season of Docs?

Google did an amazing thing by creating Season of Docs. It built real opportunities for technical writers to collaborate with open source organizations.

Season of Docs is a three-month mentoring program that pairs technical writers with open source organizations. Writers have the opportunity to work with well-known and highly-regarded organizations. Open source organizations (who often don’t have a budget for technical writers) have the opportunity to work with experienced technical writers to improve and expand their existing documentation.

It’s pretty incredible.

I’m working with NumPy! Just to make things even cooler, there’s so much overlap between the NumPy and SciPy projects, that we get to meet frequently and collaborate with each other. That means that I get to update all of you with the changes we’re making!

Photo by Pixabay from Pexels

Since I hadn’t yet learned a lot about Christina when I wrote the last post, it seemed like a good idea to use today’s post to introduce her to you.

I made a couple of very minor tweaks, but here’s what Christina had to say about herself and her plans:

Meet Christina Lee!

Overall, I want to improve SciPy.org and docs.scipy.org’s design and structure.

I’m returning to Python after being a Julia programmer, so I might be helpful for newbie proofing Python code. I write Julia Jupyter notebooks on a variety of physics and numerics topics, available at albi3ro.github.io/M4 . At JuliaCon, I gave a lightning talk on “Teaching with Code”, written up at http://albi3ro.github.io/M4/Teaching_With_Code.html, which summarizes my code teaching ideals.

From her proposal:

Work on both the SciPy website and docs.scipy needs to start with a structural and graphical overhaul. At each page, I cannot instinctively tell how to navigate to what I want, what the purpose of the page is, or what the page wants me to feel and do. While Sphinx may be the tool of choice for documentation, we can pull away from Sphinx for both the main website (scipy.org) and the tutorials in favor of a more versatile web layout. Designing two distinct layouts for scipy.org and docs.scipy.org will help clear up the confusion between the ecosystem and the package.

While reworking the container for the content would form a good portion of the GSoD project, I would also work on the content on the website. The content breaks down into tutorial pages and surrounding pages. For the tutorials, I would highlight the basic usage front and center to get users up and going rapidly. Then I would want to focus on explaining what the numerical method accomplishes and what is possible beyond basic usage. Tutorials already exist, but editing could make them better. Reworking the content on the main pages would help with the navigational and structure problems discussed above.

If it sounds exciting to work with organizations like NumPy and SciPy, just do it! Don’t wait! People get really overwhelmed at the idea of working on the code for an open-source organization. But there’s more going on than just the code. You can’t imagine how helpful it can be to have someone step in on the documentation side.

If you want to get involved with open-source projects, get involved. If you love to write (or you love to work on the writing other people have done), get in there and work your magic! It’s up to everyone to make the tech world an even more amazing place than it already is.

If you’re into data science, machine learning, artificial intelligence, or technology in general, then you’ve seen some documentation. If you’re having trouble understanding some of it, don’t sit back and wish things were different. Get in there and help.

Make a difference!

You might get to learn something new. You might even get to meet some incredibly cool people!

Photo by Tim Mossholder from Pexels

If you want to contribute to open-source organizations but don’t know how to use GitHub, check out this article:

Getting started with Git and GitHub: the complete beginner’s guide
Git and GitHub basics for the curious and completely confused

Thanks for reading! As always, if you do anything cool with this information, let everyone know about it in the comments below!

The post NumPy and SciPy and Google Season of Docs, Oh My: Meet Christina Lee appeared first on Content Simplicity.

What do You Want to See in the NumPy Docs?

Anne B — Thu, 19 Sep 2019 00:12:55 +0000

Behind the scenes at NumPy and SciPy with Google Season of Docs

Season of Docs has begun!!!

What is Google Season of Docs?

Google did an amazing thing by creating Season of Docs. It built real opportunities for technical writers to collaborate with open source organizations.

Season of Docs is a three-month mentoring program. It pairs technical writers with open source organizations. Writers have the opportunity to work with well-known and highly-regarded organizations. Open source organizations (who often don’t have a budget for technical writers) have the opportunity to work with experienced technical writers to improve and expand their existing documentation.

It’s pretty incredible.

The goal of Season of Docs is to provide a framework for technical writers and open source projects to work together towards the common goal of improving an open source project’s documentation. For technical writers who are new to open source, the program provides an opportunity to gain experience in contributing to open source projects. For technical writers who’re already working in open source, the program provides a potentially new way of working together. Season of Docs also gives open source projects an opportunity to engage more of the technical writing community.
During the program, technical writers spend a few months working closely with an open source community. They bring their technical writing expertise to the project’s documentation, and at the same time learn about the open source project and new technologies.
The open source projects work with the technical writers to improve the project’s documentation and processes. Together they may choose to build a new documentation set, or redesign the existing docs, or improve and document the open source community’s contribution procedures and onboarding experience.
Together, we raise public awareness of open source docs, of technical writing, and of how we can work together to the benefit of the global open source community.
~Introduction to Google Season of Docs

It’s a win-win!

Photo by It’s me, Marrie from Pexels

What is NumPy?

At its most basic level, NumPy is numeric, or numerical (Num) Python (Py).

From the official documentation:

“NumPy is the fundamental package for scientific computing in Python. It is a Python library that provides a multidimensional array object, various derived objects (such as masked arrays and matrices), and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more.”

It’s a hugely important open source Python library. It’s the core library for scientific computing in Python. It’s useful in data science, machine learning, deep learning, artificial intelligence, computer vision, science, engineering, and more. It adds support for large, multi-dimensional arrays and matrices and a huge collection of high-level mathematical functions that can operate on the arrays.

The ancestor of NumPy, Numeric, was originally created by Jim Hugunin. By 2000, interest in creating a complete environment for scientific and technical computing was growing. In 2001, Travis Oliphant, Eric Jones, and Pearu Peterson merged code they had written and called the resulting package SciPy. In early 2005, Travis Oliphant wanted to unify the community around a single array package, so he created NumPy by incorporating features of Numarray into Numeric, with tons of modifications. The project started out as part of SciPy. So that people wouldn’t have to install the large SciPy package just to get an array object, the new package was separated out and called NumPy. NumPy 1.0 was released in 2006.

What’s SciPy?

It’s scientific (Sci) Python (Py)! SciPy is a free and open source Python library. It’s used for scientific computing and technical computing. It contains modules for optimization, linear algebra, integration, interpolation, special functions, FFT, signal and image processing, ODE solvers and other tasks common in science and engineering. SciPy uses NumPy arrays as the basic data structure. It has modules for various commonly used tasks in scientific programming. These tasks include integration (calculus), ordinary differential equation solving, and signal processing.

SciPy builds on the NumPy array object. It’s part of the NumPy stack. The stack includes tools like Matplotlib, Pandas, and SymPy, and an expanding set of scientific computing libraries. Its users come from all fields of science, engineering and beyond. Python has one of the largest, if not the largest, scientific user communities. Comparable communities have formed around R, Julia, and MATLAB.

Still with me?

Photo by Passerina from Pexels

The Process

Google announced Season of Docs in March 2019. In April, open source organizations had the opportunity to apply to be a part of the program. Google announced the selected organizations on April 30. Technical writers were able to look over the list of 45 organizations and choose projects that interested them. They could submit up to three project proposals. From May 29 to June 28, technical writer applications were open! After the application deadline was over, each organization selected the technical writing projects that they were interested in mentoring.

On August 6, Google announced the accepted writing projects!

The program received more than 700 technical writing project proposals from nearly 450 technical writers. Each organization was able to select one technical writer for an approved project. The NumPy/SciPy team, however, believes so strongly in moving their documentation forward that they decided to go above and beyond. They secured additional funding outside of Season of Docs, which allowed them to include three more writers under the same conditions as the program.

Where did the funding come from?

NumPy received two grants that are kind of a package deal (you can read about them here and here). The Moore and Sloan foundations awarded $1.3M to the Berkeley Institute for Data Science (BIDS) to support the development of NumPy. The funding period runs from April 2018 to October 2020. (Stéfan van der Walt, a NumPy Steering Council member, agreed to provide the funds from that grant.)

Ralf Gommers, one of the core programmers behind NumPy and SciPy and the Director of Quansight Labs, is the point of contact for both organizations. Ralf is an incredible person, and he had this to say about Season of Docs:

“When I first saw the Season of Docs announcement, I loved the idea of the program — working with a tech writer would be both an interesting new experience for me personally, and potentially massively beneficial to NumPy and SciPy. So I spent a lot of effort on both writing a very engaging ideas page, and then following up with writers that showed interest. I probably had ~10 video calls, and many more email threads.
Then, it turned out that there was a lot of interest, and the quality of applicants and proposals was really high. I started thinking about how to not only get one or two 3-month projects running, but how to engage these writers in a way that would make them enjoy the experience enough to stay around after the project. One thing that came to mind was that people like working with like-minded others. However, we don’t yet have technical writers — adding one to NumPy and one to SciPy may not be enough. So I decided to start building a documentation team. The ideas and people were there, so next what’s needed is funding.
NumPy has a significant active grant, so I discussed the possibility of using some of that grant funding for the extra Season of Docs projects with Stéfan. Stéfan is awesome, and he also sees the value of both the proposed projects and of building a team of writers. So he agreed to reserve some funds for this purpose. So here we are today — excited to get started!”
~Ralf Gommers

Who are the writers?

The writers selected for the NumPy/SciPy documentation projects are amazing, and you need to know who they are!

Maja Gwozdz

The official technical writer selected by SciPy during Season of Docs is Maja Gwozdz. Her project proposal is called “User-oriented documentation and thorough restructuring.” You can read all about it here, but essentially, Maja intends to work on the refactoring of the existing documentation, so that it would be easily accessible by users with different needs.

Anne Bonner

Yours truly (yay!) was the official selection for NumPy, with the project proposal, “Making ‘The Basics’ a Little More Basic: Improving the Introductory NumPy Sections.” Since there’s nothing that makes me happier than helping beginners understand complex information and technologies, NumPy is the perfect challenge!

I’m excited to dig into the introductory NumPy materials to create something more accessible for people with little or no experience. NumPy is in such an interesting position: it’s incredibly complex, but it’s also one of the most important libraries for beginners who are interested in working with data. I’ll be creating beginner-level documentation of basic concepts in NumPy that can function as a stepping stone for people who want to use NumPy, not necessarily study it.

Shekhar Rajak

Shekhar Rajak was selected for “Numpy.org redesign and high-level documentation restructuring for end-user focus.” His goals for the project include:

Designing and developing better UI for www.numpy.org
Enhancing and modifying the contents of www.numpy.org: NumPy User Guide, NumPy Benchmarking, F2Py Guide, NumPy Developer Guide, Building and Extending the Documentation, NumPy Reference, About NumPy, Reporting bugs, and all other development-related pages.
Adding content about when to use NumPy and when to use the XND and Dask array Python libraries, which provide similar APIs.
Preserving the Python API documentation.

Brandon David

Brandon David was selected for his project “Improve the documentation of scipy.stats.” Brandon plans to fill out missing functions as well as add examples and internal links. His goal is to clear up ambiguity and work through issues on GitHub.

Christina Lee

Christina Lee was selected for her proposal, “SciPy documentation: Design, Usability and Content.” She is a recent addition, and I’m looking forward to sharing her work with you soon!

Harivallabha Rangarajan

Harivallabha Rangarajan is planning to contribute to the documentation and complement the work of the writers selected for Season of Docs in any way he can. He’s particularly interested in writing end-to-end tutorials for the scipy.stats module. He writes that “having more comprehensive tutorials will help users get a better idea of how and where the available methods may be used in the pipeline.”

Welcome to Season of Docs!!!

It’s incredible to be involved in the inner workings of NumPy and SciPy. So far, we’ve been joining meetings with the team, getting to know the core players, and learning the workflow. I can’t wait to keep you guys updated with our projects as they develop!

Photo by Pineapple Supply Co. from Pexels

Get involved!

Now that you know the major players on the writing side, don’t be afraid to reach out and let us know if there’s information you want to see in the official documentation! Who knows, we might just be able to give you what you want to see.

If the idea of getting involved with open-source organizations interests you, get in there and start sharing! Don’t wait for an invitation. Start contributing now! It’s up to everyone to make the tech world an even more amazing place than it already is.

If you’re interested in contributing to open-source organizations but have no idea how to get started with GitHub, you might want to check out this article:

Getting started with Git and GitHub: the complete beginner’s guide
Git and GitHub basics for the curious and completely confused

Thanks for reading! As always, if you do anything cool with this information, let everyone know about it in the comments below!

The post What do You Want to See in the NumPy Docs? appeared first on Content Simplicity.

How to Effortlessly Connect OBIEE to Tableau 2019.2

Anne B — Mon, 09 Sep 2019 05:00:00 +0000

Image by Evren Ozdemir from Pixabay

Are you frustrated with how difficult it is to visualize your OBIEE data the way you want to? Do you wish that your OBIEE data could simply and securely connect with Tableau?

It can! BI Connector allows you to securely access and use your OBIEE and Taleo data right in Tableau, PowerBI, and Qlik.

If you want to use visualizations to communicate the results of your analysis, you probably want to work with modern, easy-to-use visual analytics tools like Tableau. But there’s no simple way to do this with your OBIEE data! Making it happen manually means wasting an average of 4–5 hours per week creating and using exports and imports of OBIEE data into a visualization tool. That can translate to tens of thousands of dollars wasted. With BI Connector, you can easily connect to OBIEE subject areas and reports in minutes. You simply log in with your OBIEE credentials, reusing your existing OBIEE business logic. You don’t have to make any changes to OBIEE.

You save time and money.

BI Connector is perfect for anyone interested in machine learning, data analysis, data visualization, business analysis, data science, and predictive analysis. It’s easy enough for beginners to use and it offers benefits that the most advanced power users will enjoy. It’s the number one BI integration solution used by enterprise customers. It’s simple, secure, and efficient. Setup takes less than five minutes. You can run direct query-based live connections to both subject areas and reports and immediately use your results to create gorgeous, responsive, and intuitive visualizations. It allows you to make faster decisions and avoid common errors.

It will save you an incredible amount of time and effort.

Photo by nickgesell via Pixabay

BI Connector is hands-down the easiest way to connect the capabilities of OBIEE with the intuitive visualizations of Tableau. It allows you to take the power and security of OBIEE and effortlessly combine it with everything your favorite visualization tool has to offer.

OBIEE

Image labeled for reuse via Wikimedia Commons

If you’re serious about data, there’s a very good chance that you’re already using OBIEE. It’s amazing for data reporting and intelligence. It can hold a huge volume of data and it’s perfect for medium and large enterprises. It also handles complex structures extremely well. Plus Oracle has developed pre-defined BI solutions that are available in OBIEE. When you create a BI solution in OBIEE, that solution is implemented immediately. It offers interactive dashboards and reporting, actionable intelligence, proactive detection and alerts, Microsoft Office integration, and a lot more.

That being said, there are simply not as many visualization options available in OBIEE as there are in Tableau. The options that are available are not as user-friendly as the ones in Tableau. OBIEE also has a limited ability to work with other tools and often requires the purchase of an extra license to use them.

OBIEE also requires a significant amount of education to use properly and it’s not as easy to connect your OBIEE data to Tableau as it could be. You can do it, of course, but solutions like creating Excel exports or SQL scripts waste time. Wasted time is wasted money. (It’s also worth noting that exporting your data and then importing it puts your data at risk of unauthorized access.)

Tableau

Image labeled for reuse via Wikipedia

Tableau, on the other hand, offers gorgeous and easy-to-create visualizations. It’s well-suited for small and medium enterprises. It allows for the use of a number of different tools. It’s intuitive and user-friendly, offering simple drag-and-drop responsive charts, one-click formulas, filters, and a lot more. There are also some fun new features in 2019.2, including vector-based maps! You can check out the new features here. But while Tableau can handle a lot of data, it can’t manage the huge volume of data that OBIEE can handle with ease. It’s very challenging to use when you have more than 25 tables or more than 16 columns. And everything in Tableau needs to be developed from scratch.

These two tools aren’t replacements for each other. They can do impressive work together! But getting them to work together can be challenging. If you decide to export your data from OBIEE and then import it into Tableau, you wind up duplicating data and duplicating logic as well. Your results might be inconsistent, and you expose yourself to potential security risks.

That’s where BI Connector comes in!

BI Connector is the fun, simple, and secure way to connect OBIEE and Tableau. With BI Connector, you can create your visualizations in Tableau using your OBIEE data in no time. BI Connector uses the OBIEE security model, so your data is protected. It automates the process of moving your OBIEE data into Tableau and it keeps your data safe.

BI Connector is great for everyone on your team, from IT directors and analysts to human resources and even your sales and marketing team. It sits right on top of the OBIEE layer and allows you to easily integrate your favorite data visualization tool. It’s simple and intuitive and it bridges the gap between technologies, saving you an incredible amount of time and money. You get plug-and-play access for your users from what they’ve already built in OBIEE. You can run your results right from your data warehouse. It’s also fun and secure! You can create your visualizations in minutes while protecting your data with the OBIEE security model.

If you’ve already invested thousands of dollars in your data tools, why waste any time trying to get those tools to work together? BI Connector is the simple way to connect your tools. You don’t have to make any changes to OBIEE or to Tableau. Plus, you’re working with a tested and secure corporate data warehouse. You get self-service data visualization with the security and governance of OBIEE.

How to Use BI Connector to connect OBIEE and Tableau

BI Connector is incredibly easy to install. It takes less than five minutes. You can find a helpful step-by-step guide here and really helpful videos here that will walk you through the process if you run into any issues.

Step 1: Download

First, you’ll need to go to the BI Connector website to download BI Connector. You can click the button that says “Try it free” and drop down the menu to “Visualize OBIEE data in” and click on “Tableau.”

Enter your information, click the button that says “Try BI Connector for Tableau” and your download will start. There’s no credit card or commitment required. You have 30 days to try it out and see what a game-changer it is.

You’ll double click on the “BIConnector-Desktop-Edition-x64-Tableau.exe” file to unzip it and specify the location and then start the installation. You’ll click on the button that says “Install” and follow the prompts.

At the end of the installation process, you’ll see a popup window that lets you know that your installation has been completed. Make sure the box is checked next to the line that says “Launch ODBC Administrator” and click “Finish.”

Step 2: ODBC Administrator and License Activation

Next, you’ll want to create a new data source. Go to the ODBC Database Administrator and click “Add.” A box will pop up where you can activate your 30-day trial license. Enter your personal information and then either leave the trial license number as-is (it will show up automatically) or change it to the key for the license you purchased and click “Activate.” This will take you back to the window where you will now be able to create a new data source.

You’ll click on the button that says “Add.”

You’ll need to enter the data source name (OBIEE Connect here) and your server name, port, user ID, and password. This is the information you use for your Oracle BI server.

If you log in to OBIEE with something like “http://obieefunstuff.websitename.com:9704/analytics” then your server name will be “http://obieefunstuff.websitename.com” and your port will be 9704. Your user ID and password are your OBIEE user ID and password. It’s a good idea to take a look at the official step-by-step guide for more information. It walks you through some common scenarios, including what to do if you don’t see your port number in the URL.
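
If it helps to see that split spelled out, here is a minimal Python sketch that pulls the server name and port out of a login URL. It only illustrates the rule described above and isn't part of BI Connector. The helper name odbc_fields_from_url is made up for this example.

```python
from urllib.parse import urlsplit


def odbc_fields_from_url(login_url):
    """Split an OBIEE login URL into the server name and port for the ODBC form.

    Hypothetical helper for illustration only; it is not a BI Connector API.
    """
    parts = urlsplit(login_url)
    # The server name keeps the scheme and host, without the port or the path.
    server_name = f"{parts.scheme}://{parts.hostname}"
    # parts.port is None when the URL has no explicit port number.
    return server_name, parts.port


server_name, port = odbc_fields_from_url(
    "http://obieefunstuff.websitename.com:9704/analytics"
)
print(server_name)  # http://obieefunstuff.websitename.com
print(port)         # 9704
```

If the port comes back as None, your URL doesn’t show a port number, and that’s the scenario the official guide covers.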

Now click “Test connection” to make sure everything is working.

Step 3: Configure access for Subject Areas or Reports

In the next window, you’ll see two little radio buttons that allow you to select either “Subject Areas” or “Reports.” Go ahead and choose one or the other and hit “Save.” You can always go to this screen again any time you want to change or update your information.

Step 4: Configure Tableau

Now it’s time to head over to Tableau! Launch Tableau, then click “More Servers” and then go to “Other Databases (ODBC).” Now you can select the data source that you created and click “Connect.” Make sure your server name, port, and User ID are exactly the same as the ones you used previously and enter the same password here as well. Test your connection and hit “OK.” You’ll click “OK” again on the “Other Databases (ODBC)” screen.

You’re ready to start visualizing your data!

Go ahead and select your database from the dropdown menu and click “OK.”

Done

That’s it! You’re all set!

You can drag and drop your information just like you normally would! Easily create filters, add color, labels, tooltips, and everything else you normally want to do with Tableau. You have your data at your fingertips and you can work with it in exactly the way you want to.

You can even blend OBIEE data with other data files right in Tableau in exactly the same way that you normally would. Start with the same process, then click “Add” in the upper left corner or drop your Data menu down to “New Data Source.” Select “Text File,” for example, if you have data in a CSV file. That will open a window where you can select your file.

Now you’re all set to work with your additional data!

It’s really that easy. You’re minutes away from being able to combine the power of OBIEE with the ease and appeal of Tableau.

What are you waiting for? Give BI Connector a try today!

The post How to Effortlessly Connect OBIEE to Tableau 2019.2 appeared first on Content Simplicity.

What is Deep Learning and How Does it Work?

Anne B — Sat, 07 Sep 2019 07:01:00 +0000

Photo by Chevanon Photography from Pexels

Sit back, relax, and get comfortable with cool concepts like neural networks, gradient descent, backpropagation, and more.

What is deep learning?

It’s learning from examples. That’s pretty much the deal.

At a very basic level, deep learning is a machine learning technique. It teaches a computer to filter inputs through layers to learn how to predict and classify information. Observations can be in the form of images, text, or sound.

The inspiration for deep learning is the way that the human brain filters information. Its purpose is to mimic how the human brain works to create some real magic.

It’s literally an artificial neural network.

In the human brain, there are about 100 billion neurons. Each neuron connects to thousands of its neighbors. We’re kind of recreating that, but in a way and at a level that works for machines.

In our brains, a neuron has a body, dendrites, and an axon. The signal from one neuron travels down the axon and transfers to the dendrites of the next neuron. That connection where the signal passes is called a synapse.

Image by mohamed_hassan on Pixabay

Neurons by themselves are kind of useless. But when you have lots of them, they work together to create some serious magic. That’s the idea behind a deep learning algorithm! You get input from observation and you put your input into one layer. That layer creates an output which in turn becomes the input for the next layer, and so on. This happens over and over until your final output signal!

The neuron (node) gets a signal or signals (input values), which pass through the neuron. That neuron delivers the output signal.

Think of the input layer as your senses: the things you see, smell, and feel, for example. These are independent variables for one single observation. This information is broken down into numbers and the bits of binary data that a computer can use. You’ll need to either standardize or normalize these variables so that they’re within the same range.
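
Here’s a small sketch of what normalizing and standardizing can look like in plain Python. The function names and the sample ages are made up for illustration. Min-max scaling and z-scores are common choices, but they aren’t the only ones.

```python
from statistics import mean, pstdev


def normalize(values):
    """Min-max normalization: rescale the values to the range 0 to 1."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]


def standardize(values):
    """Standardization: shift to a mean of 0 and scale to a standard deviation of 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


ages = [20, 30, 40, 50, 60]  # made-up input values for one feature
print(normalize(ages))       # [0.0, 0.25, 0.5, 0.75, 1.0]
print(standardize(ages))     # values centered on 0
```

Either way, every input variable ends up on a comparable scale, so no single feature drowns out the others just because its raw numbers are bigger.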

Deep learning algorithms use many layers of nonlinear processing units for feature extraction and transformation. Each successive layer uses the output of the previous layer for its input. What they learn forms a hierarchy of concepts. In this hierarchy, each level learns to transform its input data into a more and more abstract and composite representation.

Image by ahmedgad on Pixabay

That means that for an image, for example, the input might be a matrix of pixels. The first layer might abstract the pixels and encode the edges. The next layer might compose an arrangement of edges. The next layer might encode a nose and eyes. The next layer might recognize that the image contains a face, and so on.

What happens inside the neuron?

The input node takes in information in a numerical form. The information is presented as an activation value where each node is given a number. The higher the number, the greater the activation.

Each of the synapses gets assigned weights, which are crucial to Artificial Neural Networks (ANNs). Weights are how ANNs learn. By adjusting the weights, the ANN decides to what extent signals get passed along. When you’re training your network, you’re deciding how the weights are adjusted.

The activation runs through the network until it reaches the output nodes. The output nodes then give us the information in a way that we can understand. Your network will use a cost function to compare the predicted output with the actual expected output. The cost function is how the model’s performance is evaluated. It’s expressed as the difference between the actual value and the predicted value. There are many different cost functions you can use, but they all measure the same thing: how much error you have in your network. You’re working to minimize the cost function (also called the loss function). In essence, the lower the loss, the closer the output is to what you want. The information goes back, and the neural network begins to learn with the goal of minimizing the cost function by tweaking the weights. This process is called backpropagation.
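
To make the idea of a cost function concrete, here’s a minimal sketch of one common choice, mean squared error. The article doesn’t commit to a particular cost function, so treat this as just one example. The sample numbers are made up.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sum(squared_errors) / len(squared_errors)


actual = [1.0, 0.0, 1.0]        # what we wanted the network to output
good_guess = [0.9, 0.1, 0.8]    # predictions close to the actual values
bad_guess = [0.4, 0.7, 0.3]     # predictions far from the actual values

print(mean_squared_error(actual, good_guess))  # small cost
print(mean_squared_error(actual, bad_guess))   # larger cost
```

The closer the predictions are to the actual values, the lower the cost. That’s the number the network is trying to push down when it tweaks its weights.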

In forward propagation, information is entered into the input layer and propagates forward through the network to get our output values. We compare the values to our expected results. Next, we calculate the errors and propagate the info backward. This allows us to train the network and update the weights. Because of the way the backpropagation algorithm is structured, you’re able to adjust all of the weights simultaneously. This allows you to see which part of the error each of your weights in the neural network is responsible for.

When you’ve adjusted the weights to the optimal level, you’re ready to proceed to the testing phase!

Photo by Yogendra Singh from Pexels

How does an artificial neural network learn?

There are two different approaches to get a program to do what you want. First, there’s the specifically guided and hard-programmed approach. You tell the program exactly what you want it to do. Then there are neural networks. In neural networks, you tell your network the inputs and what you want for the outputs, and then you let it learn on its own.

By allowing the network to learn on its own, you can avoid the necessity of entering in all of the rules. You can create the architecture and then let it go and learn. Once it’s trained up, you can give it a new image and it will be able to predict the right output for it.

Feedforward and feedback networks

A feedforward network is a network that contains inputs, outputs, and hidden layers. The signals can only travel in one direction (forward). Input data passes into a layer where calculations are performed. Each processing element computes based upon the weighted sum of its inputs. The new values become the new input values that feed the next layer (feed-forward). This continues through all the layers and determines the output. Feedforward networks are often used in, for example, data mining.
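
Here’s a rough sketch of that one-way flow in plain Python. The layer sizes and weight values are made up for illustration, and the sigmoid function stands in for whatever activation you’d actually pick.

```python
import math


def sigmoid(x):
    """Squash any number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, layer_weights):
    """Each row of layer_weights holds the incoming weights for one neuron."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)))
        for neuron_weights in layer_weights
    ]


def feedforward(inputs, layers):
    """Pass the values forward, layer by layer, until the output layer."""
    values = inputs
    for layer_weights in layers:
        # The outputs of this layer become the inputs of the next one.
        values = layer_forward(values, layer_weights)
    return values


# Made-up network: 2 inputs -> 2 hidden neurons -> 1 output neuron
layers = [
    [[0.5, -0.4], [0.3, 0.8]],  # hidden layer weights
    [[1.2, -0.7]],              # output layer weights
]
output = feedforward([1.0, 0.5], layers)
print(output)  # a single value between 0 and 1
```

Notice that nothing ever loops back. Each layer only ever sees what the layer before it produced.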

A feedback network (for example, a recurrent neural network) has feedback paths. This means that they can have signals traveling in both directions using loops. All possible connections between neurons are allowed. Since loops are present in this type of network, it becomes a non-linear dynamic system which changes continuously until it reaches a state of equilibrium. Feedback networks are often used in optimization problems where the network looks for the best arrangement of interconnected factors.

What is a weighted sum?

Inputs to a neuron can either be features from a training set or outputs from the neurons of a previous layer. Each connection between two neurons has a unique synapse with a unique weight attached. If you want to get from one neuron to the next, you have to travel along the synapse and pay the “toll” (weight). The neuron then applies an activation function to the sum of the weighted inputs from each incoming synapse. It passes the result on to all the neurons in the next layer. When we talk about updating weights in a network, we’re talking about adjusting the weights on these synapses.

A neuron’s input is the sum of weighted outputs from all the neurons in the previous layer. Each input is multiplied by the weight associated with the synapse connecting the input to the current neuron. If there are 3 inputs or neurons in the previous layer, each neuron in the current layer will have 3 distinct weights: one for each synapse.
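
As a quick sketch, here’s the weighted sum for a single neuron with 3 inputs. The input and weight values are made up for illustration.

```python
def weighted_sum(inputs, weights):
    """Multiply each input by the weight on its synapse, then add everything up."""
    return sum(x * w for x, w in zip(inputs, weights))


inputs = [0.5, 1.0, -2.0]   # outputs from 3 neurons in the previous layer
weights = [0.4, 0.3, 0.1]   # one weight per incoming synapse

total = weighted_sum(inputs, weights)
print(total)  # roughly 0.3, which is 0.2 + 0.3 - 0.2
```

That single number is what gets handed to the activation function.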

What is an activation function?

In a nutshell, the activation function of a node defines the output of that node.

The activation function (or transfer function) translates the input signals to output signals. It maps the output values on a range like 0 to 1 or -1 to 1. It’s an abstraction that represents the rate of action potential firing in the cell. It’s a number that represents the likelihood that the cell will fire. At its simplest, the function is binary: yes (the neuron fires) or no (the neuron doesn’t fire). The output can be either 0 or 1 (on/off or yes/no), or it can be anywhere in a range.

What options do we have? There are many activation functions, but these are four very common ones:

Threshold function

This is a step function. If the summed value of the input is less than zero, the function passes on 0. If it’s equal to or more than zero, then it passes on 1. It’s a very rigid, straightforward, yes or no function.

Example threshold function

Sigmoid function

This function is used in logistic regression. Unlike the threshold function, it’s a smooth, gradual progression from 0 to 1. It’s useful in the output layer and is used heavily for predicting probabilities.

Example sigmoid function

Hyperbolic Tangent Function

This function is very similar to the sigmoid function. But unlike the sigmoid function which goes from 0 to 1, the value goes below zero, from -1 to 1. Even though this isn’t a lot like what happens in a brain, this function gives better results when it comes to training neural networks. Neural networks sometimes get “stuck” during training with the sigmoid function. This happens when there’s a lot of strongly negative input that keeps the output near zero, which messes with the learning process.

Example hyperbolic tangent function (tanh)

Rectifier function

This might be the most popular activation function in the universe of neural networks. It’s the most efficient and biologically plausible. Even though it has a kink, it’s smooth and gradual after the kink at 0. This means, for example, that your output would be either “no” or a percentage of “yes.” This function doesn’t require normalization or other complicated calculations.

Example rectifier function

What?

So let’s say, for example, your desired value is binary. You’re looking for a “yes” or a “no.” Which activation function do you want to use?

From the above examples, you could use the threshold function or you could go with the sigmoid activation function. The threshold function would give you a “yes” or “no” (1 or 0). The sigmoid function would be able to give you the probability of a yes.

If you were using a sigmoid function to determine how likely it is that an image is a cat, for example, an output of 0.9 would show a 90% probability that your image is, in fact, a cat.
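
Here’s a minimal sketch of all four activation functions in plain Python, so you can compare what they do with the same input. The function names are just labels for this example.

```python
import math


def threshold(x):
    """Step function: 0 below zero, 1 at zero or above."""
    return 1.0 if x >= 0 else 0.0


def sigmoid(x):
    """Smooth curve from 0 to 1."""
    return 1.0 / (1.0 + math.exp(-x))


def hyperbolic_tangent(x):
    """Smooth curve from -1 to 1."""
    return math.tanh(x)


def rectifier(x):
    """0 for negative input, the input itself otherwise."""
    return max(0.0, x)


for x in (-2.0, 0.0, 2.0):
    print(
        x,
        threshold(x),
        round(sigmoid(x), 3),
        round(hyperbolic_tangent(x), 3),
        rectifier(x),
    )
```

For a yes-or-no question, threshold gives you a hard 0 or 1, while sigmoid gives you a value in between that you can read as a probability.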

Photo by minanafotos on Pixabay

Want to dive deeper? Check out Deep Sparse Rectifier Neural Networks by Xavier Glorot, et al.

How do you adjust the weights?

You could use a brute force approach to adjust the weights and test thousands of different combinations. But even with the simplest neural network that has only five input values and a single hidden layer, you’ll wind up with 10⁷⁵ possible combinations.

Running this on the world’s fastest supercomputer would take longer than the universe has existed so far.

Enter gradient descent

If you go with gradient descent, you can look at the angle of the slope of the weights and find out if it’s positive or negative. This allows you to continue to slope downhill to find the best weights on your quest to reach the global minimum.

Photo by RANJAN SIMKHADA from Pexels

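Think of it like this: a hiker is trying to get down a mountain, and she can’t see the way ahead. She has a tool that measures the steepness of the hill where she’s standing, which tells her which direction leads downhill. Unfortunately, this tool takes forever.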

She wants to use it as infrequently as she can to get down the mountain before dark. The real difficulty is choosing how often she wants to use her tool so she doesn’t go off track.

In this analogy, the person is the algorithm. The steepness of the hill is the slope of the error surface at that point. The direction she goes is the gradient of the error surface at that point. The tool she’s using is differentiation (the slope of the error surface can be calculated by taking the derivative of the squared error function at that point). The rate at which she travels before taking another measurement is the learning rate of the algorithm. It’s not a perfect analogy, but it gives you a good sense of what gradient descent is all about. The machine is learning the gradient, or direction, that the model should take to reduce errors.
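
Here’s a tiny sketch of gradient descent on a single weight. The squared error function below is made up for illustration, with its lowest point at a weight of 3. The starting point and learning rate are arbitrary choices too.

```python
def squared_error(w):
    """Made-up error surface with its lowest point at w = 3."""
    return (w - 3.0) ** 2


def slope(w):
    """Derivative of the squared error: the steepness of the hill at w."""
    return 2.0 * (w - 3.0)


def gradient_descent(start, learning_rate, steps):
    w = start
    for _ in range(steps):
        # Measure the slope, then step downhill. The learning rate sets how far.
        w -= learning_rate * slope(w)
    return w


best_w = gradient_descent(start=0.0, learning_rate=0.1, steps=50)
print(best_w)                 # very close to 3
print(squared_error(best_w))  # very close to 0
```

A positive slope means the weight needs to go down, and a negative slope means it needs to go up. Either way, you end up walking downhill.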

Gradient descent requires the cost function to be convex, but what if it isn’t?

Normal gradient descent can get stuck at a local minimum rather than a global minimum, resulting in a subpar network. In normal gradient descent, we take all our rows and plug them into the same neural network, take a look at the cost function, and then adjust the weights. This is called batch gradient descent. In stochastic gradient descent, we take the rows one by one, run the neural network, look at the cost function, adjust the weights, and then move to the next row. Essentially, you’re adjusting the weights for each row.

Stochastic gradient descent has much higher fluctuations, which can help it escape local minima and find the global minimum. It’s called “stochastic” because samples are fed to the network one at a time in random order, rather than as a single group or in the order they appear in the training set. It looks like it might be slower, but it’s actually faster because it doesn’t have to load all the data into memory and wait while the data is all run together. The main pro for batch gradient descent is that it’s a deterministic algorithm. This means that if you have the same starting weights, every time you run the network you will get the same results. Stochastic gradient descent is always working at random. (You can also run mini-batch gradient descent where you set a number of rows, run that many rows at a time, and then update your weights.)
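
One way to see the difference between batch, stochastic, and mini-batch gradient descent is that only the number of rows per weight update changes. Here’s a rough sketch with a made-up one-weight model (prediction = weight × input) and made-up data where the right weight is 2. The train function and its defaults are assumptions for this example.

```python
import random


def train(rows, batch_size, learning_rate=0.05, epochs=20, seed=0):
    """Fit a single weight w so that w * x is close to y for every row."""
    rng = random.Random(seed)
    rows = list(rows)
    w = 0.0
    updates = 0
    for _ in range(epochs):
        rng.shuffle(rows)  # visit the rows in a random order each epoch
        for start in range(0, len(rows), batch_size):
            batch = rows[start:start + batch_size]
            # Average slope of the squared error over the rows in this batch
            gradient = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
            w -= learning_rate * gradient
            updates += 1
    return w, updates


rows = [(x, 2.0 * x) for x in (1.0, 2.0, 3.0, 4.0)]  # made-up data: y = 2x

batch_w, batch_updates = train(rows, batch_size=len(rows))  # batch
sgd_w, sgd_updates = train(rows, batch_size=1)              # stochastic
mini_w, mini_updates = train(rows, batch_size=2)            # mini-batch

print(batch_w, batch_updates)
print(sgd_w, sgd_updates)
print(mini_w, mini_updates)
```

All three land close to a weight of 2 here. The stochastic version simply makes a lot more (and noisier) updates along the way.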

Many improvements on the basic stochastic gradient descent algorithm have been proposed and used, including implicit updates (ISGD), momentum method, averaged stochastic gradient descent, adaptive gradient algorithm (AdaGrad), root mean square propagation (RMSProp), adaptive moment estimation (Adam), and more.

So here’s a quick walkthrough of training an artificial neural network with stochastic gradient descent:

1. Randomly initialize the weights to small numbers close to 0.
2. Input the first observation of your dataset into the input layer, with each feature in one input node.
3. Forward propagation: from left to right, the neurons are activated in a way that each neuron’s activation is limited by the weights. You propagate the activations until you get the predicted result.
4. Compare the predicted result to the actual result and measure the generated error.
5. Backpropagation: from right to left, the error is back propagated. The weights are updated according to how much they are responsible for the error. (The learning rate decides how much we update the weights.)
6. Online learning (repeat steps 2–5 and update the weights after each observation) OR batch learning (repeat steps 2–5, but update the weights only after a batch of observations).
7. When the whole training set has passed through the ANN, that is one epoch. Repeat with more epochs.
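
To tie the walkthrough together, here’s a compact sketch of the whole loop in plain Python. Everything specific in it is an assumption made for illustration: the tiny network (2 inputs, 2 hidden neurons, 1 output, no bias terms), the sigmoid activation, the squared error cost, the learning rate, and the made-up task of learning the logical OR of two inputs.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, hidden_w, output_w):
    """Forward propagation: inputs -> hidden layer -> single output."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x))) for ws in hidden_w]
    output = sigmoid(sum(w * h for w, h in zip(output_w, hidden)))
    return hidden, output


def mean_cost(data, hidden_w, output_w):
    """Mean squared error over the whole dataset."""
    errors = [(forward(x, hidden_w, output_w)[1] - y) ** 2 for x, y in data]
    return sum(errors) / len(errors)


def train(data, epochs=2000, learning_rate=0.5, seed=1):
    rng = random.Random(seed)
    # Randomly initialize the weights to small numbers close to 0.
    hidden_w = [[rng.uniform(-0.1, 0.1) for _ in range(2)] for _ in range(2)]
    output_w = [rng.uniform(-0.1, 0.1) for _ in range(2)]
    costs = [mean_cost(data, hidden_w, output_w)]  # cost before training
    rows = list(data)
    for _ in range(epochs):  # one pass over the whole training set = one epoch
        rng.shuffle(rows)
        for x, y in rows:  # feed in one observation at a time
            hidden, output = forward(x, hidden_w, output_w)
            error = output - y  # compare predicted result to actual result
            # Backpropagation: work out each weight's share of the error.
            delta_out = error * output * (1 - output)
            delta_hidden = [
                delta_out * output_w[j] * hidden[j] * (1 - hidden[j])
                for j in range(2)
            ]
            # Update the weights after each observation.
            for j in range(2):
                output_w[j] -= learning_rate * delta_out * hidden[j]
                for i in range(2):
                    hidden_w[j][i] -= learning_rate * delta_hidden[j] * x[i]
    costs.append(mean_cost(data, hidden_w, output_w))  # cost after training
    return hidden_w, output_w, costs


# Made-up training set: the logical OR of two inputs
or_data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
hidden_w, output_w, costs = train(or_data)
print("cost before training:", costs[0])
print("cost after training:", costs[-1])
```

The cost after training should come out lower than the cost before training. That drop is the network learning.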

There you have it! Those are the basic ideas behind what’s happening in an artificial neural network.

Congratulations! Now you know what deep learning is and how it works!

Hungry for more? You might want to read Efficient BackProp by Yann LeCun, et al., as well as Neural Networks and Deep Learning by Michael Nielsen. If you’re interested in learning more about cost functions, check out A List of Cost Functions Used in Neural Networks, Alongside Applications.

You might also want to check out this one:

Convolutional Neural Networks and Image Classification

Thanks for reading! As always, if you do anything cool with this information, leave a comment in the notes below or reach out on LinkedIn @annebonnerdata.

The post What is Deep Learning and How Does it Work? appeared first on Content Simplicity.

Your Mobile Banking App has a Problem

Anne B — Sun, 02 Jun 2019 20:24:45 +0000

Errors in machine learning algorithms are creating critical (and nearly invisible) consequences

(This article first appeared in Towards Data Science)

Do you use a mobile banking app?

We have almost certainly paid thousands, if not millions, of dollars for returned checks that aren’t actually bad.

I’m not sure if anyone is aware of it.

It is impossible that I’m the only person that this has happened to.

But it’s easy to see how we might all be missing it.

Photo by Ryoji Iwata on Unsplash

The technology behind mobile banking is pretty incredible, but what happens when there’s a mistake?

What happens if we don’t see the mistake?

We’re living in a world where so many technological advancements have been made that they almost blend into the background. We’ve gotten used to the idea that we can let our phones and computers do the little things for us. It’s easy to forget how new all of this technology really is.

But it is new. It’s changing every day. There are algorithms behind most of the basic things that you take for granted, from social media and entertainment to banking and finances. They are constantly evolving.

They are not perfect.

Pay attention!

If an image capture system makes an error within your banking app that causes your deposit to be rejected, what will that cost you?

What if no one sees it? What will that cost us all?

This is not a bad check!

Recently, I deposited a check on a mobile banking app that was accepted, only to be returned as a bad check a few days later. I was charged a fee for this.

Here’s the problem: that check was not bad.

Here’s the bigger problem: this took almost two weeks to sort out and involved a huge amount of wasted time. It was practically by chance that I even caught the actual issue. There was an error in the image capturing system within a mobile banking app. An easy-to-miss mistake that may have already happened to you without you even being aware of it.

There are a lot of perks to working as a freelancer, but the financial side can get complicated. Rather than a steady stream of checks from a single source, you’re depending on a variety of clients to send you payments from different accounts on what can be a pretty random schedule.

I’m lucky enough to have amazing clients who I trust completely. Not that mistakes can’t happen! They absolutely can. People lose track of their balances, grab checks from the wrong account or a closed account, and so on. But I’ve been incredibly fortunate to have never yet had a client send a bad check.

You can imagine my surprise when, days after having a mobile deposit approved through the banking app for a major bank, that check was returned and my account was hit with a sizable fee. This is a client I’ve known and trusted for years, and it didn’t make sense. I reached out to the client, and he was as baffled by the situation as I was.

I contacted the bank and spent a very long time on the phone with someone who assured me that there was nothing that she could do. It was a bad check. When someone writes you a bad check, you have to pay a fee. You’re expected to have your client reimburse you for that fee. End of story. Your bank can’t possibly go to some other bank and demand that they pay the fee, so it’s up to you. Just go back to your client and get a new check.

Sounds legit, right?

Bad news, banker

Unfortunately for her, I wasn’t going to let this go. Here’s where I have the advantage over a lot of other people out there who might take what she’s saying at face value. People who would simply return to the client and request a new check. One with added charges to cover the fee and possibly interest as well:

I know and trust my client.
I was holding the check in my hand and could see that it was drawn on an account from this same bank. (This suggested that some information that she had was incorrect.)
I am familiar with some of the algorithms driving image capture and classification as well as their potential weak spots. (I actually write about technology and artificial intelligence here on Medium.)
I’m aware that this bank recently did a major tech upgrade.

I used to be a personal banker with this same bank. It was a long time ago, but I know what information a banker has access to and what steps she can and can’t take.

It’s unlikely that you are in this position and that’s why I’m writing this story.

How many people do you think have simply gone back after depositing a check on a banking app, gotten a new check, and paid the fee without identifying the actual problem?

You will believe your banker. You will believe that you received a bad check and proceed from there. Your client will need to provide another check with additional fees. It could affect your relationship with them. The returned check may cause you to overdraw your account, incurring more fees and much larger problems. Multiple returned checks in a short period can cause you to lose your account. A lot of things can go very badly here, all because of an error in a machine learning algorithm.

I want you to have this information. I want you to know what you’re looking for. You can and should ask questions. Is the problem really with the banking app and not with the check? There are a lot of things a banker can’t tell you but plenty of information is available to you. Was the check returned because of insufficient funds? Is this an account that doesn’t exist? What other steps can you take?

What is really going on here?

Ask the questions!

Photo by Artem Maltsev on Unsplash

So what happened?

That was surprisingly hard to figure out. I’m sure the banker on the phone hopes she never hears from me again. Eventually, it turned out that the account that the check was drawn on was not the account number entered into the system. That means that the bank couldn’t locate the bank account in question.

I had to wait for the bank to send me a certified copy of my deposit so that I could take that and the original check to a bank for a banker to examine in person. At that point, the banker could try to figure out the problem and determine whether or not to reverse the fee. (For added fun, your fee can generally only be reversed by the bank where you opened your account. That means that I’d have to wait for a banker I’ve never spoken with halfway across the country to decide whether to reverse my fee.)

So, I waited for the mail.

It took close to a week. It’s not hard to imagine what this would be like for someone who was now overdrawn on their account because of a mistake entirely on the bank’s end.

Take it to the bank

Once the check arrived and my toddler could be spared from the world’s most boring adventure, I headed over to the bank and went through all of this again in person.

This banker said once again that this was a bad account. The customer definitely existed, but the account wasn’t an open account. It must be one that had been closed and my client had grabbed some old checks by mistake. I’d just have to get a new check.

Easy mistake. It could happen to anyone.

This was all so plausible, but I know my client. I also know routing numbers. This guy was not the sort of person who would be holding on to a box of checks from a closed account that he opened in a state where he lived seven years ago. Not happening.

Somehow, even though the banker knew with absolute certainty that she was in the right, she looked again.

She saw the problem. It was so simple.

The image capture system that the banking app uses had cut off the last two digits of the account number on the check.

That’s it.

That mistake, both tiny and massive, caused a returned check, a fee, and nearly two weeks’ worth of headaches and wasted time that could have been productive.

It could have been far worse.

That bank error was practically invisible.

Enter artificial intelligence

So where does this image capture system come from? Is it unique to this bank?

It turns out that most of the major banks use the same company for the image capture, recognition, and analysis systems within their banking apps. This company does incredible work and I’m in no way questioning that. I have zero proof that they are directly the cause of this issue.

The company uses artificial intelligence to develop algorithms for image recognition. They’re using machine learning algorithms to do incredible things with document and ID verification. They’ve created an image capture software development system built on computer vision and machine learning algorithms. It detects corners and glare, can detect and analyze images on a variety of backgrounds, contains built-in analytics, offers real-time image assessment, and has a lot of other cool features.

But it’s not, apparently, flawless.

That said, I don’t believe that this company is directly the cause of the problem. I know that the bank I use for my freelance work has recently undergone a major tech redesign and they’ve made changes to their online and mobile banking app. It might be possible that the redesign on the bank’s end caused a problem with the image capture system.

It also might be possible that there is a problem with the image capture system itself. I’m having a hard time locating any records of errors on the tech company’s part, but that doesn’t necessarily mean that there aren’t any. They might have an amazing PR department or a stellar legal team. Or there might not be a lot of other people who happen to be in a position to notice exactly what happened here.

I have no way of knowing where the fault lies or how often this happens. But it’s impossible that this has only happened one time.

This is potentially a massive problem. Even if 0.1% of customers who use mobile banking apps are having (or will have) this issue, that’s a huge problem within our economy. An enormous number of people use mobile banking apps and that number is growing. The company that makes this technology also builds technology that’s used in ID and document scanning. Can it just drop numbers? These are the numbers that represent our bank accounts and our identities. This kind of mistake is not acceptable and will have extremely serious ramifications.

What’s the solution?

I don’t know yet, but I know we need one. We definitely need to start training bankers to watch for this problem. Banking apps are not perfect. Remember that all of this technology is still new. Remember that you have the right to ask questions. You have the right to get to the bottom of the situation, even when people tell you that you are already there. This is almost certainly happening everywhere and we need to find a way to fix this problem.

It’s up to all of us to pay attention. No one is going to solve this problem if they don’t know about it.

Don’t let this slide. It’s too important.

If anyone else has had the same issue, feel free to discuss it in the comments below. As always, reach out any time on LinkedIn @annebonnerdata.

The post Your Mobile Banking App has a Problem appeared first on Content Simplicity.

Multiple Linear Regression in 4 Lines of Code!

Anne B — Thu, 30 May 2019 20:25:08 +0000

Conquer the basics of multiple linear regression (and backward elimination!) and use your data to predict the future!

(This article first appeared in Towards Data Science)

Multiple linear regression is a lot of fun. Being able to predict the future is awesome.

You might want to predict how well a stock will do based on some other information that you just happen to have.

It might help you to know if how often you bathe and how many cats you have relates to how long you’ll live.

You might want to figure out if there’s a relationship between a man who 1.) calls his mom more than three times a day, 2.) refers to another man as “bro,” 3.) has never done his own laundry and above-average divorce rates.

Multiple linear regression might be for you!

Multiple linear regression is fun because it looks at the relationships within a bunch of information. Instead of just looking at how one thing relates to another thing (simple linear regression), you can look at the relationship between a lot of different things and the thing you want to predict.

A linear regression model is a statistical model that’s frequently used in data science. It’s also one of the basic building blocks of machine learning! Multiple linear regression (MLR/multiple regression) is a statistical technique. It can use several variables to predict the outcome of a different variable. The goal of multiple regression is to model the linear relationship between your independent variables and your dependent variable. It looks at how multiple independent variables are related to a dependent variable.

I’m going to assume that you know a little bit about simple linear regression. If you don’t, check out this article on building a simple linear regressor. It will give you a quick (and fun) walk-through of the basics.

Simple linear regression is what you can use when you have one independent variable and one dependent variable. Multiple linear regression is what you can use when you have a bunch of different independent variables!

Multiple regression analysis has three main uses.

You can look at the strength of the effect of the independent variables on the dependent variable.
You can use it to ask how much the dependent variable will change if the independent variables are changed.
You can also use it to predict trends and future values.

Let’s do that one!

Image by RondellMelling via Pixabay

We’re going to keep things super simple here so that multiple linear regression as a whole makes sense. I do want you to know that things can get a lot more complex than this in the real world.

How do I begin?

For the purposes of this post, you are now working for a venture capitalist.

Congratulations!

So here’s the thing: you have a dataset in front of you with information on 50 companies. You have five columns that contain information about how much those companies spend on admin, research and development (R&D), and marketing, their location by state, and their profit for the most recent year. This dataset is anonymized, which means we don’t know the names of these companies or any other identifying information.

You’ve been hired to analyze this information and create a model. You need to inform the guy who hired you what kind of companies will make the most sense in the future to invest in. To keep things simple, let’s say that your employer wants to make this decision based on last year’s profit. This means that the profits column is your dependent variable. The other columns are the independent variables.

So you want to learn about the dependent variable (profit) based on the other categories of information you have.

The guy who hired you doesn’t want to invest in these specific companies. He wants to use the information in this dataset as a sample. This sample will help him understand which of the companies he looks at in the future will perform better based on the same information.

Does he want to invest in companies that spend a lot on R&D? Marketing? Does he want to invest in companies that are based in Illinois? You need to help him create a set of guidelines. You’re going to help him be able to say something along the lines of, “I’m interested in a company that’s based in New York that spends very little on admin expenses but a lot on R&D.”

You’re going to come up with a model that will allow him to assess where and into which companies he wants to invest to maximize his profit.

Linear regression is great for correlation, but remember that correlation and causation are not the same things! You are not saying that one thing causes the other, you’re finding which independent variables are strongly correlated to the dependent variable.

There are some assumptions that absolutely have to be true:

There is a linear relationship between the dependent variable and the independent variables.
The independent variables aren’t too highly correlated with each other.
Your observations for the dependent variable are selected independently and at random.
Regression residuals are normally distributed.

You need to check that these assumptions are true before you proceed and build your model. We’re totally skipping past that here. Make sure that if you’re doing this in the real world, you aren’t just blindly following this tutorial. Those assumptions need to be correct when you’re building your regression!

Dummy variables

If you aren’t familiar with the concept of dummy variables, check out this article on data cleaning and preprocessing. It has some simple code that we can go ahead and copy and paste here.

So we’ve already decided that “profit” is our dependent variable (y) and the others are our independent variables (X). We’ve also decided that what we want is a linear regression model. What about that column of states? “State” is a categorical variable, not a numerical variable. We need our independent variables to be numbers, not words. What do we do?

Photo by 3dman_eu via Pixabay

Let’s create a dummy variable!

If you looked at the information in the locations column, you might see that all of the companies that are being examined are based in two states. For the purposes of this explanation, let’s say all of our companies are located in either New York or Minnesota. That means that we’ll want to turn this one column of information into two columns of 1s and 0s. (If you want to learn more about why we’re doing that, check out that article on simple linear regression. It explains why this would be the best way to arrange our data.)

So how do we populate those columns? Basically, we’ll turn each state into its own column. If a company is located in New York, it will have a 1 in the “New York” column and a 0 in the “Minnesota” column. If you were using more states, you’d have a 1 in the New York column, and, for example, a 0 in the “California” column, a zero in the “Illinois” column, a 0 in the Arkansas column, and so on. We won’t be using the original “locations” column anymore because we won’t need it!

These 1s and 0s are basically working as a light switch. 1 is “on” or “yes” and 0 is “off” or “nope.”

Beware the dummy variable trap

You never want to include both variables at the same time.

Why is that?

You’d be duplicating a variable. The first variable (d1) is always equal to 1 minus the second variable (d2). (d1 = 1-d2) When one variable predicts another, it’s called multicollinearity. As a result, the model wouldn’t be able to distinguish the results of d1 from the results of d2. You can’t have the constant and both dummy variables at the same time. If you have nine dummy variables, include eight of them. (If you have two sets of dummy variables, then you have to do this for each set.)
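
Here’s a minimal sketch of that idea using pandas. The tiny table, its column names, and its numbers are all made up for illustration, and get_dummies is just one convenient way to build these columns (it isn’t the approach used later in this post).

```python
import pandas as pd

# Hypothetical mini-dataset: names and values are made up for illustration.
companies = pd.DataFrame({
    "rd_spend": [165000, 162000, 153000, 144000],
    "state": ["New York", "Minnesota", "New York", "Minnesota"],
})

# One column of 1s and 0s per state.
all_dummies = pd.get_dummies(companies["state"], dtype=int)

# The dummy columns in each row always add up to 1, so one column is redundant.
row_sums = all_dummies.sum(axis=1)

# drop_first=True keeps one fewer column than there are categories,
# which is one way to stay out of the dummy variable trap.
safe_dummies = pd.get_dummies(companies["state"], drop_first=True, dtype=int)

print(all_dummies)
print(safe_dummies)
```

With two states, the “safe” version is a single New York column: a 1 means New York and a 0 means Minnesota.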

What is the P-value?

You’re going to want to be familiar with the concept of a P-value. That’s definitely going to come up.

The P-value is the probability of getting a sample like ours (or more extreme than ours) if the null hypothesis is true.

It gives a value to the weirdness of your sample. If you have a large P-value, then you probably won’t change your mind about the null hypothesis. A large value means that it wouldn’t be at all surprising to get a sample like yours if the hypothesis is true. As the P-value gets smaller, you should probably start to ask yourself some questions. You might want to change your mind and maybe even reject the hypothesis.

The null hypothesis is the official way to refer to the claim (hypothesis) that’s on trial here. It’s the default position where there’s just no association among the groups that are being tested. In every experiment, you’re looking for an effect among the groups that are being tested. Unfortunately, there’s always the possibility that there’s no effect (or no difference) between the groups. That lack of difference is called the null hypothesis.

It’s like if you were doing a trial of a drug that doesn’t work. In that trial, there just wouldn’t be a difference between the group that took the drug and the rest of the population. The difference would be null.

You always assume that the null hypothesis is true until you have evidence that it isn’t.
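
If you’d like to see a P-value come out of some numbers, here’s a rough sketch of the drug trial idea. It uses a shuffle-the-labels approach (a permutation test), which is not how the regression summary later in this post gets its P-values, but it follows the same logic: how often would pure chance give a gap at least as big as the one we saw? The group sizes, the effect sizes, and the function name are all assumptions made up for this example.

```python
import numpy as np

def permutation_p_value(group_a, group_b, n_shuffles=10_000, seed=0):
    """Estimate how often shuffled group labels give a gap at least as big as the real one."""
    rng = np.random.default_rng(seed)
    observed = abs(group_a.mean() - group_b.mean())
    pooled = np.concatenate([group_a, group_b])
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        shuffled = rng.permutation(pooled)
        gap = abs(shuffled[:len(group_a)].mean() - shuffled[len(group_a):].mean())
        if gap >= observed:
            at_least_as_extreme += 1
    # Add 1 to both counts so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_shuffles + 1)

# Made-up measurements for three groups of 30 people.
rng = np.random.default_rng(42)
placebo = rng.normal(loc=100, scale=1, size=30)
drug_that_works = rng.normal(loc=110, scale=1, size=30)        # big, invented effect
drug_that_does_nothing = rng.normal(loc=100, scale=1, size=30)  # same as placebo

p_works = permutation_p_value(placebo, drug_that_works)
p_nothing = permutation_p_value(placebo, drug_that_does_nothing)
print(p_works, p_nothing)
```

A tiny P-value for the drug that works says a gap that large would be very surprising if the null hypothesis were true. The P-value for the drug that does nothing can land anywhere between 0 and 1.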

Let’s keep moving!

We need to figure out which columns we want to keep and which we want to toss. If you just chuck a bunch of stuff into your model, it won’t be a good one. It definitely won’t be reliable! (Also, at the end of the day, you need to be able to explain your model to the guy who hired you to create this thing. You’re only going to want to explain the variables that actually predict something!)

There are essentially five methods of building a multiple linear regression model.

1. Chuck Everything In and Hope for the Best
2. Backward Elimination
3. Forward Selection
4. Bidirectional Elimination
5. Score Comparison

You’ll almost certainly hear about Stepwise Regression as well. Stepwise regression is most commonly used as another way of saying bidirectional elimination (method 4). Sometimes when people use that phrase they’re referring to a combination of methods 2, 3, and 4. (That’s the idea behind bidirectional elimination as well.)

Method 1 (Chuck Everything In): Okay. That isn’t the official name for this method (but it should be). Occasionally you’ll need to build a model where you just throw in all your variables. You might have some kind of prior knowledge. You might have a particular framework you need to use. You might have been hired by someone who’s insisting that you do that. You might want to prepare for backward elimination. It’s a real option, so I’m including it here.

Method 2 (backward elimination): This has a few basic steps.

1. First, you’ll need to set a significance level for which data will stay in the model. For example, you might want to set a significance level of 5% (SL = 0.05). This is important and can have real ramifications, so give it some thought.
2. Next, you’ll fit the full model with all possible predictors.
3. You’ll consider the predictor with the highest P-value. If your P-value is greater than your significance level, you’ll move to step four. Otherwise, you’re done!
4. Remove that predictor with the highest P-value.
5. Fit the model without that predictor variable. Once you remove a variable, you need to refit and rebuild the model. The coefficients and constants will be different. When you remove one, it affects the others.
6. Go back to step 3, do it all over, and keep doing that until you come to a point where even the highest P-value is < SL. Now your model is ready. All of the variables that are left have P-values that are less than the significance level.

(After we go through these concepts, I’ll walk you through an example of backward elimination so you can see it in action! It’s definitely confusing, but if you really look at what’s going on, you’ll get the hang of it.)

Method 3 (forward selection): This is way more complex than just reversing backward elimination.

1. Choose your significance level (SL = 0.05).
2. Fit all possible simple regression models and select the one with the lowest P-value.
3. Keep this variable and fit all possible models with one extra predictor added to the one you already have. If we selected a simple linear regressor with one variable, now we’d fit every model that has two variables: the one we kept plus each of the others in turn.
4. Find the predictor with the lowest P-value. If P < SL, go back to step 3. Otherwise, you’re done!

We can stop when P < SL is no longer true. You won’t keep the current model, though. You’ll keep the previous one because, in the final model, your variable is insignificant.

Method 4 (bidirectional elimination): This method combines the previous two!

1. Select a significance level to enter and a significance level to stay (SLENTER = 0.05, SLSTAY = 0.05).
2. Perform the next step of forward selection where you add the new variable. You need to have your P-value be less than SLENTER.
3. Now perform all of the steps of backward elimination. The variables must have a P-value less than SLSTAY in order to stay.
4. Now head back to step two, then move forward to step 3, and so on until no new variables can enter and no new variables can exit.

You’re done!

Method 5 (score comparison): Here, you’re going to be looking at all possible models. You’ll look at a comparison of the scores for all of them. This is definitely the most resource-consuming approach!

Select a criterion of goodness of fit (for example, Akaike criterion)
Construct all possible regression models
Select the one with the best criterion

Fun fact: if you have 10 columns of data, you’ll wind up with 1,023 models here. You’d better be ready to commit if you’re going to go this route!
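
Where does that number come from? Every non-empty combination of columns is one possible model. Here’s a quick sketch that counts them the slow way (the variable names are just for illustration):

```python
from itertools import combinations

n_columns = 10
columns = range(n_columns)

# Every non-empty subset of the columns is one candidate model.
n_models = sum(
    1
    for size in range(1, n_columns + 1)
    for _ in combinations(columns, size)
)

print(n_models)  # matches 2**10 - 1
```

Each extra column doubles the number of combinations, so this approach gets expensive fast.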

Ummm, what?

If you’re just getting started with machine learning, statistics, or data science, that all looks like it will be an insane amount of code. It’s not!

So much of what you need to do with a machine learning model is all ready to go with the amazing libraries out there. You’ll need to do the tough parts where you decide what information is important and what kind of models you’ll want to use. It’s also up to you to interpret the results and be able to communicate what you’ve built. However, the code itself is very doable.

GIF via GIPHY

Let me show you!

Backward elimination is the fastest and the best method to start with, so that’s what I’m going to walk you through after we build the quick and easy multiple linear regression model.

First, let’s prepare our dataset. Let’s say we have a .csv file called “startups.csv” that contains the information we talked about earlier. We’ll say it has 50 companies and columns for R&D spending, admin spending, marketing spending, what state the company is located in (let’s say, New York, Minnesota, and California), and one column for last year’s profit.

It’s a good idea to import your libraries right away.

# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

Now we can go ahead and copy and paste the code from that data cleaning and preparation article! We’re definitely going to want to change the name of our dataset to ours. I’m calling it ‘startups.csv.’ We’ll adjust a couple of other tiny details as well. Profit (y) is still our last column, so we’ll continue to leave it out of X with [:, :-1]. We’ll make a little adjustment to grab our dependent variable with [:, 4]. Now we have a vector of the dependent variable (y) and a matrix of independent variables that contains everything except the profits (X). We want to see if there is a linear dependency between the two!

dataset = pd.read_csv('startups.csv')
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, 4].values

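Now we need to encode the categorical variable. We can use a one hot encoder inside a column transformer to create dummy variables. (That other article used LabelEncoder together with OneHotEncoder(categorical_features = [3]), but the categorical_features argument has been removed from newer versions of Scikit-Learn, so we’ll adapt the code a little. Make sure you’re grabbing the right information and you don’t encode the dependent variable.) Our state column sits at index 3, so that’s the index we’ll hand to the encoder [3]. Setting remainder to 'passthrough' keeps the other columns, which wind up after the new dummy columns.

from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
# One hot encode column 3 (state) and pass the numeric columns through unchanged
columntransformer = ColumnTransformer([('state', OneHotEncoder(), [3])], remainder = 'passthrough', sparse_threshold = 0)
# Convert the result to a plain array of floats
X = np.array(columntransformer.fit_transform(X), dtype = float)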

You’re ready to go! Our one column of information is now three columns, each of which corresponds to one state!

What about avoiding the dummy variable trap? You don’t actually need to do that with our libraries! It’s all taken care of for you here with the libraries that we’re choosing to use. However, if you ever want or need to run that code, it’s simple! You can do that with one line right after you encode your data.

X=X[:, 1:]

What does that do? It removes the first column from X. Putting the 1 there means that we want to take all of the columns starting at index 1 to the end. You won’t take the first column. For some libraries, you’ll need to take one column away manually to be sure your dataset won’t contain redundancies.

Now let’s split our training and testing data. The most common split is an 80/20 split, which means 80% of our data would go to training our model and 20% would go to testing it. Let’s do that here!

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)
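
If you want to check what the split actually does, here’s a small sketch with stand-in data. The arrays and the _demo names are made up so that this runs on its own. They just have the same number of rows as our example.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in data: 50 rows and 5 feature columns, like our example.
# Row i of X_demo starts with 5 * i, and y_demo[i] is i, so pairs are easy to check.
X_demo = np.arange(250).reshape(50, 5)
y_demo = np.arange(50)

X_train_demo, X_test_demo, y_train_demo, y_test_demo = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0
)

print(X_train_demo.shape, X_test_demo.shape)

# Each row of X still lines up with its own y value after shuffling.
rows_still_paired = bool((X_train_demo[:, 0] == 5 * y_train_demo).all())
print(rows_still_paired)
```

With 50 companies, an 80/20 split leaves 40 rows for training and 10 for testing.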

What about feature scaling?

We don’t need to do feature scaling here! Ordinary least squares works fine with features that are on different scales, so we can skip that step.

Photo by Gift Habeshaw on Unsplash

Multiple linear regression time!

We’ll import linear regression from Scikit-Learn. (That makes a little sense, doesn’t it?)

from sklearn.linear_model import LinearRegression

Now we’ll introduce our regressor. We’ll create an object of the class LinearRegression and we’ll fit the object to our training set. We want to apply this to both our X_train and y_train.

regressor = LinearRegression()
regressor.fit(X_train, y_train)

Now let’s test the performance of our multiple linear regressor!

(We won’t plot a graph here because we’d need five dimensions to do that. If you’re interested in plotting a graph with a simple linear regressor, check out this article on building a simple linear regressor.)

We’ll create the vector of predictions (y_pred). We can use the regressor with the predict method to predict the observations of the test set (X_test).

y_pred = regressor.predict(X_test)

That’s it! Four lines of code and you’ve built a multiple linear regressor!

GIF via GIPHY

Now we can see the ten predicted profits! You can print them any time with a simple print(y_pred). We can easily compare them by taking a look at the predictions and then comparing them to the actual results. If you were to take a look, you’d see that some are incredibly accurate and the rest are pretty darn good. Nice work!
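
Here’s one way that comparison could look. Since the startups file isn’t included with this post, this sketch runs on randomly generated stand-in data, so the numbers it prints are not the results described above. The variable names and the choice of R-squared as a score are assumptions for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Made-up data: 50 rows, 3 numeric features, and a target that depends on two of them.
rng = np.random.default_rng(0)
X_fake = rng.uniform(0, 100, size=(50, 3))
y_fake = 10 + 2.0 * X_fake[:, 0] + 0.5 * X_fake[:, 1] + rng.normal(0, 1, size=50)

X_tr, X_te, y_tr, y_te = train_test_split(X_fake, y_fake, test_size=0.2, random_state=0)
model = LinearRegression().fit(X_tr, y_tr)
preds = model.predict(X_te)

# Line up each prediction with the real value.
for predicted, actual in zip(preds, y_te):
    print(f"predicted {predicted:8.2f}   actual {actual:8.2f}")

# One number that summarizes how close the predictions are (1.0 is perfect).
score = r2_score(y_te, preds)
print("R-squared on the test set:", round(score, 3))
```

Looking at the pairs side by side is a nice sanity check, and a single score makes it easier to compare one model with another.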

There is definitely some linear dependency between our dependent and independent variables. We can clearly see a strong linear relationship between the two.

Congratulations!! You now know how to make a multiple linear regressor in Python!

Want to keep going?

Things are about to get more challenging!

What if some of the variables have a lot of impact on our dependent variable and some are statistically insignificant? We can definitely find out which are the variables that have the highest impact on the dependent variable. We’ll want to find a team of variables that all have a definite effect, positive or negative.

Let’s use backward elimination!

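We need to prepare something specific for backward elimination. We want the statsmodels library, so let’s import statsmodels.api. (The OLS class we’re about to use lives there. Older tutorials import it from statsmodels.formula.api, which can fail in newer versions of the library.) That’s a little long to have to keep retyping, so we’ll make a shortcut using sm.

import statsmodels.api as sm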

We need to add a column of ones in our matrix of features of independent variables because of the way it works with the constant. (Our model needs to take into account our constant b0. In most libraries it’s included, but not in the stats model that we’re using. We’ll add a column of ones so our stats model will understand the formula correctly.)

This starts pretty simply. We’ll use .append because we want to append.

(Love Python ❤️)

We have our matrix of features X. The values argument is perfect for us because it’s an array. We’ll input a matrix of 50 lines and one column with 1s inside. We can create that with Numpy’s np.ones. We’ll need to specify the numbers of lines and columns we want (50,1). We need to convert the array into the integer type to make this work, so we’ll use .astype(int). Then we need to decide if we’re adding a line or a column (line = 0, column = 1), so we’ll say axis = 1 for a column!

We want this column to be located at the beginning of our dataset. What do we do? Let’s add matrix X to the column of 50 ones, rather than the other way around. We can do that with values = X.

X = np.append(arr = np.ones((50, 1)).astype(int), values = X, axis = 1)
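
Here’s the same trick on a tiny made-up matrix so you can see exactly what it does. This sketch reads the number of rows from the matrix instead of typing in 50, which is a small change from the line above. (The statsmodels library also has an add_constant helper that does a similar job.)

```python
import numpy as np

# A tiny stand-in for our matrix of features.
X_small = np.array([[10.0, 20.0],
                    [30.0, 40.0],
                    [50.0, 60.0]])

n_rows = X_small.shape[0]

# arr is the column of ones, values is X, so the ones land in front.
X_with_ones = np.append(arr=np.ones((n_rows, 1)).astype(int), values=X_small, axis=1)

print(X_with_ones)
```

The result has one extra column, and that column of ones sits at index 0.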

Let’s do this!

We want to create a new matrix of our optimal features (X_opt). These features are the ones that are statistically significant. The ones that have a high impact on the profit. This will be the matrix containing the team of optimal features with high impact on the profit.

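We’ll need to initialize it. We can remove the variables that are not statistically significant one by one. We’ll do this by removing the index at each step. First take all the indexes of the columns in X, separated by commas [0,1,2,3,4,5]. (Those six indexes are the column of ones plus five features, which assumes that you dropped one of the three state columns with X = X[:, 1:] earlier.)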

If you look back at the methods earlier, you’ll see that we first need to select our significance level, which we talked about earlier. Then we need to fit the model!

We aren’t going to take the regressor we built. We’re using a new library, so now we need a new fit to our future optimal matrix. We’ll create a new regressor (our last one was from the linear regression library). Our new class will be ordinary least squares (OLS). We’ll need to call the class and specify some arguments. (You can check out the official documentation here.) For our arguments, we’ll need an endog (our dependent variable) and an exog (our X_opt, which is just our matrix of features (X) with the intercept, which isn’t included by default). In order to fit it we’ll just use a .fit()!

X_opt = X[:, [0, 1, 2, 3, 4, 5]]
regressor_OLS = sm.OLS(endog = y, exog = X_opt).fit()

Now we’ve initialized X_opt!

Now let’s look at our P-values! How do we look for the predictor with the highest P-values? We’ll take our regressor object and call the function .summary().

regressor_OLS.summary()

Now we can see a table with some very useful information about our model! We can see the adjusted R-squared values and our P-values. The lower the P-value, the more significant your independent variable will be with respect to your dependent variable. Here, we’re looking for the highest one. That’s easy to see.

Now let’s remove it!

We can copy and paste our code from above and remove index 2. That will look like this:

X_opt = X[:, [0, 1, 3, 4, 5]]
regressor_OLS = sm.OLS(endog = y, exog = X_opt).fit()
regressor_OLS.summary()

Just keep going until you don’t have any P-values that are higher than the SL value you chose. Remember that you always want to look at the original matrix in order to choose the correct index! You’re using the columns in your original matrix (X), not in X_opt.

You might get to the point where you have a P-value that’s incredibly close to the SL value that you chose. For example, we chose 0.050 and here’s 0.060.

GIF via GIPHY

That’s a tough situation because the value that you chose could have been anything. If you want to thoroughly follow your framework, you’ll need to remove that index. But there are other metrics that can help make more sense of whether or not we want to do that. We could add other metrics, like a criterion, that can help us decide if we really want to make that choice. There’s also a lot of information right in the summary here, like the R-squared value, that can help us make our decision.

So let’s say we ran backward elimination until the end and we’re left with only the index for the R&D spending column.

X_opt = X[:, [0, 1, 3, 4, 5]]
regressor_OLS = sm.OLS(endog = y, exog = X_opt).fit()
regressor_OLS.summary()
X_opt = X[:, [0, 1, 3, 5]]
regressor_OLS = sm.OLS(endog = y, exog = X_opt).fit()
regressor_OLS.summary()
X_opt = X[:, [0, 3, 5]]
regressor_OLS = sm.OLS(endog = y, exog = X_opt).fit()
regressor_OLS.summary()
X_opt = X[:, [0, 3]]
regressor_OLS = sm.OLS(endog = y, exog = X_opt).fit()
regressor_OLS.summary()

If we’ve been following our model carefully, that means that we now know that R&D spending is a powerful predictor for our dependent variable! The conclusion here is that the data that can predict profits with the highest impact is composed of only one category: R&D spending!

You did it! You used multiple linear regression and backward elimination! You figured out that looking at R&D spending will give you the best sense of what a company’s profits will be!

You’re amazing!

As always, if you’re doing anything cool with this information, let people know about it in the responses below or reach out any time on LinkedIn @annebonnerdata!

The post Multiple Linear Regression in 4 Lines of Code! appeared first on Content Simplicity.

Simple linear regression in four lines of code

Anne B — Mon, 27 May 2019 21:14:31 +0000

A clear and comprehensive blueprint for absolutely anyone who wants to build a simple machine learning model

(This article first appeared in Towards Data Science)

Even you can build a machine learning model.

Seriously!

Good data alone doesn’t always tell the whole story. Are you trying to figure out what someone’s salary should be based on their years of experience? Do you need to examine how much you’re spending on advertising in relation to your yearly sales? Linear regression might be exactly what you need!

What is linear regression?

Linear regression looks at the relationship between the data you have and the data you want to predict.

Linear regression is a basic and commonly used type of predictive analysis. It’s one of the most widely used statistical techniques. It quantifies the relationship between one or more predictor variables and one outcome variable.

Linear regression models are used to show (or predict) the relationship between two variables or factors. Regression analysis is commonly used to show the correlation between two variables.

You could, for example, look at some information about players on a baseball team and predict how well they might do that season. You might want to examine some variables about a company and predict how well its stock might do. You might even just want to examine the number of hours people study and how well they do on a test, or you could look at students’ homework grades overall in relation to how well they might do on their tests. It’s a seriously useful technique!

Photo by StockSnap via Pixabay

Just remember: correlation is not causation! Just because a relationship exists between two variables doesn’t mean that one variable caused the other variable! Regression analysis is not used to prove cause-and-effect relationships. It can look at how variables relate to each other. It can examine to what extent variables are associated with each other. It’s up to you to take a closer look at those relationships.

A couple of important terms:

The variable that the equation in your linear regression model is predicting is called the dependent variable. We call that one y. The variables that are being used to predict the dependent variable are called the independent variables. We call them X.

You can think of it as though the prediction (y) is dependent on the other variables (X). That makes y the dependent variable!

In simple linear regression analysis, each observation consists of two variables. These are the independent variable and the dependent variable. Multiple regression analysis looks at two or more independent variables and how they correlate to the dependent variable. The equation that describes how y is related to X is called the regression model!

Regression was first studied in depth by Sir Francis Galton, a man with a wide variety of interests. While he was a very problematic character with a lot of beliefs worth disagreeing with, he did write some books with cool information about things like treating spear wounds and getting your horse unstuck from quicksand. He also did some useful work with fingerprints, hearing tests, and even devised the first weather map. He was knighted in 1909.

While studying data on the relative sizes between parents and their children in plants and animals, he observed that larger-than-average parents have larger-than-average children, but those children will be less large in terms of their relative position within their own generation. He called it regression towards mediocrity. That would be regression to the mean in modern terms.

(I have to say, though, that there is a certain sparkle to the phrase, “regression towards mediocrity” that I need to work into my day-to-day life…)

To be clear, though, we’re talking about expectations (predictions) and not absolute certainty!

What good are regression models?

Regression models are used for predicting a real value, for example, salary or height. If your independent variable is time, then you are forecasting future values. Otherwise, your model is predicting present but unknown values. Examples of regression techniques include:

Simple regression
Multiple regression
Polynomial regression
Support Vector Regression

Let’s say you’re looking at some data that includes employees’ years of experience and salaries. You want to look at the correlation between those two figures. Maybe you’re running a new business or small company that has been kind of setting the numbers randomly.

So how can you find the correlation between those two variables? In order to figure that out, we’ll create a model that will tell us what the best-fitting line is for this relationship.

Intuition

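Here’s a simple linear regression formula:

y = b0 + b1 * x1

Here b0 is the constant, b1 is the coefficient, and x1 is the independent variable.

(You might recognize this as the equation for a straight line, or trend line, from high school algebra.)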

In this equation, y is the dependent variable, which is what you’re trying to explain. For the rest of this article, y will be an employee’s salary after a certain number of years of experience.

You can see the independent variable above. That’s the variable that is associated with the change in your predicted values. The independent variable might be causing the change or simply associated with the change. Remember, linear regression doesn’t prove causation!

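The coefficient tells you how much y changes when your independent variable changes by one unit. A change in your independent variable usually isn’t matched one-for-one by a change in y, and the coefficient is how you account for that.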

Now we want to look at the evidence. We want to put a line through our data that best fits our data. A regression line can show a positive linear relationship (the line looks like it’s sloping up), a negative linear relationship (the line is sloping down), or really no relationship at all (a flat line).

The constant is the point where the line crosses the vertical axis. For example, if you looked at 0 years of experience in the graph below, your salary would be around $30,000. So the constant in the chart below would be about $30,000.

The steeper the slope, the more money you get for your years of experience. For example, maybe with 1 more year of experience, your salary (y) goes up an additional $10,000, but with a steeper slope, you might wind up with more like $15,000. With a negative slope, you’d actually lose money as you gained experience, but I really hope you won’t be working for that company for long…
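Here’s that arithmetic as a tiny sketch in plain Python. The function name is made up for illustration, and the constant and coefficient are just the example figures from the paragraphs above, not values fitted to any data.

```python
def predict_salary(years_of_experience, constant=30_000, coefficient=10_000):
    """Salary predicted by the line y = constant + coefficient * x."""
    return constant + coefficient * years_of_experience


print(predict_salary(0))                      # 30000: where the line crosses the vertical axis
print(predict_salary(1))                      # 40000: one more year adds one coefficient
print(predict_salary(1, coefficient=15_000))  # 45000: a steeper slope pays more per year
```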

How does simple linear regression find that line?

When we look at a graph, we can draw vertical lines from the regression line to our actual observations. You can see the actual observations as the dots, while the line displays the model observations (the predictions).

Each vertical line that we drew is the difference between what an employee is actually earning and what they’re modeled (predicted) to be earning. To find the best line, we square each of those differences, add them all up, and look for the line that makes that sum as small as possible.

That’s called the ordinary least squares method!
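To make that a little more concrete, here’s a minimal sketch of ordinary least squares for a single independent variable, written in plain Python with the standard closed-form solution. The function names and the five observations are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (constant, coefficient) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, divided by how much x varies on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    coefficient = sxy / sxx
    constant = mean_y - coefficient * mean_x
    return constant, coefficient


def sum_of_squares(xs, ys, constant, coefficient):
    """Sum of the squared vertical distances between the points and the line."""
    return sum((y - (constant + coefficient * x)) ** 2 for x, y in zip(xs, ys))


# Made-up observations: years of experience and salary.
years = [1, 2, 3, 4, 5]
salaries = [39_000, 52_000, 58_000, 72_000, 79_000]

constant, coefficient = fit_line(years, salaries)
best = sum_of_squares(years, salaries, constant, coefficient)
nudged = sum_of_squares(years, salaries, constant, coefficient + 500)
print(constant, coefficient, best < nudged)
```

Nudging the slope away from the fitted value makes the sum of squares bigger, which is exactly what "least squares" is promising. In practice, we let a library handle this for us.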

So how do we do that?

First the imports!

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

Now let’s preprocess our data! If you don’t know much about data cleaning and preprocessing, you might want to check out this article. It will walk you through importing libraries, preparing your data, and feature scaling.

We’re going to copy and paste the code from that article and make two tiny changes. We’ll need to change the name of our dataset, of course. Then we’ll take a look at the data. For our example, let’s say that for our employees we have one column of years of experience and one column of salaries, and that’s it. We’ll take every column except the last one for our independent variable, just like we already have set up. This time, however, the dependent variable is in the second column. Keeping in mind that our index starts at 0, that’s index 1, so we’ll make a minor change to grab that.

dataset = pd.read_csv('salary.csv')
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, 1].values

Now X is a matrix of features (our independent variable) and y is a vector of the dependent variable. Perfect!

It’s time to split our data into a training set and a test set. Normally, we would do an 80/20 split for our training and testing data. Here, though, we’re working with a small dataset of only 30 observations. Maybe this time we’ll split up our data so that we have 20 training observations and a test size of 10.

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 1/3, random_state = 0)

You have an X_train, X_test, y_train, and y_test! You’re ready to go! (Never forget that there are about a million things to learn about, change, and improve at every step of this process. The power of your model depends on you and everything that you put into it!)

Photo by Thomas William on Unsplash

We set a random state of 0 so that we can all get the same result. (There can be random factors in calculations, and I want to make sure we’re all on the same page so that nobody gets nervous.)
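If you’re curious what a seeded split is doing conceptually, here’s a rough sketch in plain Python: shuffle with a fixed seed, then cut. This is an illustration only. The function name is made up, it is not how Scikit-learn implements train_test_split, and it won’t pick the same rows that Scikit-learn does.

```python
import random


def split_train_test(rows, test_size, seed):
    """Shuffle a copy of the rows with a fixed seed, then cut off a test set."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_size)
    return shuffled[n_test:], shuffled[:n_test]


observations = list(range(30))  # stand-ins for our 30 employees
train, test = split_train_test(observations, test_size=1/3, seed=0)
train_again, test_again = split_train_test(observations, test_size=1/3, seed=0)

print(len(train), len(test))  # 20 10
print(test == test_again)     # True: same seed, same split
```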

We’ll train our model on the training set and then later predict the results based on our information. Our model will learn the correlations on the training set. Then we will test what it learned by having it predict values with our test set. We can compare our results with the actual results on the test set to see how our model is doing!

Always split your data into training and testing sets! If you test your results on the same data you used to train it, you’ll probably have really great results, but your model isn’t good! It just memorized what you wanted it to do, rather than learning anything that it can use with unknown data. That’s called overfitting, and it means that you did not build a good model!
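Here’s an exaggerated sketch of that problem in plain Python. The "memorizer" below is a deliberately silly model that only looks up answers it has already seen, and the line model uses an assumed constant and coefficient. All of the numbers and names are made up for illustration.

```python
# Made-up training data: years of experience -> salary.
training_data = {1: 39_000, 2: 52_000, 3: 58_000, 4: 72_000, 5: 79_000}


def memorizer(years):
    """A 'model' that only looks up answers it has already seen."""
    return training_data.get(years)  # None for anything new


def line_model(years):
    """A simple line with an assumed constant (30,000) and coefficient (10,000)."""
    return 30_000 + 10_000 * years


# A perfect score on the data it was trained on...
training_errors = [abs(memorizer(x) - y) for x, y in training_data.items()]
print(sum(training_errors))  # 0

# ...but nothing to say about an employee it has never seen.
print(memorizer(6))   # None
print(line_model(6))  # 90000
```

The memorizer looks flawless if you only test it on its own training data, which is exactly why you hold some data back.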

Feature scaling

We actually don’t need to do any feature scaling here!

Photo by Gift Habeshaw on Unsplash

Linear regression

Now we can fit the model to our training set!

We’ll use Scikit-learn for this. First, we’ll import the LinearRegression class from the linear_model module. Then we’ll create an object of the class: the regressor. We’ll use a method (the fit method) to fit the regressor object that we create to the training set. To create the object, we name it, then call the class using parentheses. We can do all of that in about three lines of code!

Let’s import linear regression from Scikit-learn so that we can go ahead and use it. Between the parentheses of the fit method, we’ll specify which data we want to use so our model knows exactly what we want to fit. We want to grab both X_train and y_train because we’re working with all of our training data.

You can look at the documentation if you want more details!

Now we’re ready to create our regressor and fit it to our training data.

from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(X_train, y_train)

There it is! We’re using simple linear regression on our data and we’re ready to try out our predictive ability on our test set!

This is machine learning! We created a machine, the regressor, and we had it learn the correlation between years of experience and salary on the training set.

Now it can predict future data based on the information that it has. Our machine is ready to predict a new employee’s salary based on the number of years of experience that the employee has!

Let’s use our regressor to predict new observations. We want to see how the machine has learned by looking at what it does with new observations.

We’ll create a vector of predicted values. This is a vector of predictions of dependent variables that we’ll call y_pred. To do this, we can take the regressor we created and trained and use the predict method. We need to specify which predictions to make, so we want to make sure we include the test set. For our input parameter in regressor.predict, we want to specify the matrix of features of new observations, so we’ll specify X_test.

y_pred = regressor.predict(X_test)

Seriously. That takes a single line of code!

Now y_test holds the real salaries of the 10 observations in the test set and y_pred holds the salaries that our model predicted for those same 10 employees.
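One simple way to compare the two is to look at them side by side and average the size of the misses. Here’s a minimal sketch in plain Python. The four salary pairs are hypothetical stand-ins for y_test and y_pred, not output from our model. (Scikit-learn also ships a mean_absolute_error function in sklearn.metrics if you’d rather not write your own.)

```python
def mean_absolute_error(actual, predicted):
    """Average size of the prediction errors, ignoring their direction."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical values standing in for y_test and y_pred.
y_test_example = [37_000, 122_000, 57_000, 63_000]
y_pred_example = [41_000, 123_000, 65_000, 63_000]

# Actual salary, predicted salary, and how far off the prediction was.
for actual, predicted in zip(y_test_example, y_pred_example):
    print(actual, predicted, predicted - actual)

print(mean_absolute_error(y_test_example, y_pred_example))  # 3250.0
```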

You did it! Linear regression in four lines of code!

Visualization

Let’s visualize the results! We need to see what the difference is between our predictions and the actual results.

We can plot the graphs in order to interpret the result. First, we can plot the real observations using plt.scatter to make a scatter plot. (We imported matplotlib.pyplot earlier as plt.)

We’ll look at the training set first, so we’ll plot X_train on the X coordinates and y_train on y coordinates. Then we probably want some color. We’ll do our observations in blue, and our regression line (predictions) in red. For the regression line we’ll use X_train again for the X coordinates, and then the predictions of the X_train observations.

Let’s also fancy it up a little with a title and labels for the x-axis and y-axis.

plt.scatter(X_train, y_train, color = 'blue')
plt.plot(X_train, regressor.predict(X_train), color = 'red')
plt.title('Salary vs Experience (Training set)')
plt.xlabel('Years of Experience')
plt.ylabel('Salary')
plt.show()

Now we can see our blue points, which are our real values, and our predicted values along the red line!

Let’s do the same for the test set! We’ll change the title and change “train” to “test” in the scatter plot line.

plt.scatter(X_test, y_test, color = 'blue')
plt.plot(X_train, regressor.predict(X_train), color = 'red')
plt.title('Salary vs Experience (Test set)')
plt.xlabel('Years of Experience')
plt.ylabel('Salary')
plt.show()

Make sure you notice that we aren’t changing X_train to X_test in the second line. Our regressor was already trained on the training set, and that training produced one unique model equation. If we did swap in X_test, we’d get the same regression line; we’d just be plotting different points along it.

This is a pretty good model!

Our model is doing a nice job of predicting these new employee salaries. Some of the actual observations are the same as the predictions, which is great. There isn’t a 100% dependency between the y and X variables, so some of the predictions won’t be completely accurate.

You did it! You imported libraries, cleaned and preprocessed data, built and trained a simple linear regressor, used it to make predictions, and you even visualized the results!

Congratulations!!!

Photo by Free-Photos via Pixabay

Want more?

Multiple linear regression is up next!

Keep learning!

Machine learning is built on statistics and you can’t begin to understand machine learning without concepts like the simple linear regressor. But that doesn’t mean that statistics and machine learning are the same thing! A linear regressor is very much a tool of statistics (and data science), in addition to being one of the basic building blocks of machine learning.

As always, if you’re doing anything cool with this information, let people know about it in the comments below or reach out any time on LinkedIn @annebonnerdata!

The post Simple linear regression in four lines of code appeared first on Content Simplicity.

How to Create a Totally Free Portfolio or Website

Anne B — Sun, 26 May 2019 21:03:29 +0000

Getting started with GitHub Pages: the unbelievably quick and easy guide for creating and publishing a free portfolio, blog, or website

(This article first appeared on Towards Data Science)

GitHub Pages has to be the coolest tool that people don’t know that they already have. Pretty much any repository on GitHub can be turned into a website with the click of two buttons. It’s the simplest way to build and host a totally free portfolio, website, or blog.

Do you need an online portfolio of your work for potential employers to check out but you don’t know how to make a website? Do you want to create a free portfolio, blog, or a business site but you don’t know where to start? Is it possible that you just don’t want to deal with (or pay for) website hosting, domain names, and everything else?

This one’s for you!

Photo by Leonard Alcira on Unsplash

Why should I have a website?

It’s hard to imagine anyone who wouldn’t benefit from having a website! You might need to display your portfolio for potential clients or employers. You might need to organize your projects in a way that you can share. You may want to create a blog about the things you’re doing or the places you’ve been. You might need to advertise yourself or your business or sell a product. Whatever your reason, there’s a good chance that you want to put something together without spending a ton of time on it. There’s an even better chance that you don’t want to spend a lot of money.

Photo by imarksm via Pixabay

A website is a way to connect with the world and it’s a powerful tool for communication. It’s a way to share your work, your interests, and your passions. It’s how you can create, build, and control your online image. Plus, the sooner you build your site, the more time you’ll have to build your online presence and reach the people that you want to reach. It can help you stand out in a sea of competitors.

It’s also not the easiest thing to create if you don’t know what you’re doing!

There are a few ways that a beginner can create a simple and completely free portfolio or website. The main ones are GitHub and WordPress.

WordPress is great for beginners who need all the help. I love WordPress! That’s how I got started with my very first blog! The thing about a free WordPress site is that it’s really obvious that it’s a WordPress site. You have an address that ends in wordpress.com and a WordPress logo at the bottom of every page.

If you’re getting started in the tech world, you’re going to find that you look more appealing if you know how to use GitHub. If you’ve been in the tech world for a little while now, there’s a very good chance that you already have somewhere between one and one million repositories on GitHub right now.

Why not build your website on GitHub and host it right from your repository?

So much of what matters in the tech world right now is contributing to open source. Sharing your work openly in the community is a big deal. GitHub is designed for exactly this. Putting your work on GitHub shows that you’re involved and aware. (I host my portfolio right from a repository on GitHub if you want to take a look. It’s pretty out-of-date, but it is an example of a profile site built with Bootstrap and hosted from a GitHub repo.)

When you share your projects on GitHub, people can see your code, what you’re doing, and how you’re doing it. GitHub is all about the communication of ideas.

Pretty much everyone in tech right now is using Git and/or GitHub in some way. Having your profile right there on GitHub is a great way to hold up your hand and get involved. Plus, you’ll wind up with a repository and some commits on your profile page!

If you’re brand new to everything Git, you might want to check out “Getting Started with Git and GitHub: the complete beginner’s guide.” That article will walk you through the basics of what Git and GitHub are, concepts like “repositories,” and a ton more. I’m going to assume that you already know the basics. If you don’t, it’s worth taking a few minutes to get acquainted with them.

Let’s get this party started!

There are two ways of getting started with your free portfolio or website. You might be starting completely from scratch! On the other hand, you might have a website that you’ve already put together, but you don’t know how to use GitHub to turn it into a free website.

I’ll start with option 2.

I have the files, but I don’t know what to do with them!

This couldn’t be easier. Seriously! GitHub does the rest of the work for you. I’m assuming that you already have a GitHub account and that you know what a repository is, but if you don’t, check out that getting started with Git and GitHub article.

In a nutshell, a repository is where your project will live. It’s where you’ll organize your project. You’ll keep folders, files, images, videos, spreadsheets, Jupyter notebooks, data sets, and anything else your project needs in your repository.

If you haven’t already, go ahead and initialize your project with a repository, or create a new repository and upload your files. If you have a file called “index.html” GitHub will already understand what you want to do.

Now you’re going to take advantage of GitHub Pages. Go to your GitHub repository and click “Settings.”

Scroll down to “GitHub Pages.” You’ll see this:

Now change the “Source” dropdown menu to either “master branch” or “master branch/docs folder.” Here’s the thing: if you want to publish from your “docs” folder, you seriously need to have a “docs” folder in your master branch from which you want to run your website!

Chances are, if you’re a beginner, you’ll choose “master branch,” which just means that you want to publish your repository pretty much as-is. (There have been a couple of times where I needed to tweak a file path or two, depending on how I had my folders structured.)

You’re going to see a notification that your site is ready to be published.

Be patient, wait a minute or two, and then refresh the page or try the link if you want. Once your site has been published, you’ll see this:

Try clicking on that link.

Poof! You have a free website! This could just as easily be a free portfolio or blog!

Congratulations!!!

Now for the other option:

I don’t even know how to get started!

I’m going to tackle things like Bootstrap and basic website design another time and focus on the absolute basics here. I do want you to know, though, that the world is your oyster! The only thing limiting your options here is your drive to make it happen. (Well, maybe drive and also the amount of time you have available…) Because this option is for the complete beginner, I’m going to show you how to do everything right on the GitHub website.

We’ll go ahead and create a new repository first.

Fill in your repository name, a short description, check the box that says “Initialize this repository with a README,” and then click “Create repository.”

Now go to “Settings” near the top right-hand side of the screen and then scroll down to the “GitHub Pages” section. Change the dropdown menu that says “None” to “master branch.”

First, you’ll see this:

Wait a minute or two, and then you’ll see this:

Now click the link!

You have a website! Congratulations!

That doesn’t look like much

Okay, that does look pretty boring, but you can see here that what’s displaying is your README file.

If you want to make some quick changes, you can go ahead and edit your README to display what you want people to see. To do that, go back into your repository, click the little pencil icon on your README file, and make it better!

Edit your README file

Editing the file (you’re working with a Markdown file)

How the file looks with a few edits!

You’re using Markdown, and there are a lot of things you can do with Markdown files. This includes adding text, images, links, and some basic formatting. It’s a seriously simple way to start! Here’s the Markdown Guide to basic syntax for anyone who hasn’t worked with it before.

(Remember that if you add any images to your README, you want to make sure to upload them to your repository, or GitHub won’t know what you want!)

Now go back to your website and see what you have!

Be aware that it sometimes takes a few minutes for the changes to go through. If you don’t see your changes immediately, wait a few minutes and try again. I’ve also had an issue where my laptop wanted to keep going back to an older version of my website even though I’d made changes. Deleting my browser history for the last 24 hours fixed that problem. Try the easy fixes before you freak out about the complicated stuff!

That’s an improvement, but it could be more interesting

If you’re a total beginner and you don’t know anything about CSS, but you want a little more visual appeal, try a Jekyll theme! They’re prebuilt themes that you can use to make your site look a little better with basically no effort on your end. Jekyll and GitHub will do the work for you! Your job is to push a button or two.

Go back to the “GitHub Pages” section in “Settings” and click on “Choose a theme.”

Let’s see what our website looks like if we choose the first theme that shows up. All you have to do is press the green “Select theme” button, give it a couple of minutes, and then try your website again!

And with just a few minutes of effort, we’re already getting somewhere!

That’s it! In just a few minutes, you created your own free website for your business, blog, or even your very own free portfolio site, hosted it through a GitHub repository, and it’s already up and running. You’re ready to share with the world!

Way to go!!!

Just a couple of notes:

* If you decide that you don’t want to use a theme after all, there’s no button to go back to the original version. It’s actually totally easy to get rid of your theme, though! If you go back to your repository, you’ll discover that you now have a file called “_config.yml” which contains your theme information. If you delete that file, you delete the theme!

* If you want to play around with your theme and theme options, you’ll find that the “_config.yml” file is your first stop. Now that you know that, take a look at the zillions of other Jekyll options that you have! You can even start with the Jekyll Now theme if you want a simple and already set-up blog. Your options are endless!

I can’t wait to see what you create! As always, if you make anything amazing with this information, let everyone know about it in the comments below or reach out any time on Twitter @annebonnerdata. Feel free to share your free portfolios and blogs here for everyone to see!

Thanks for reading!

The post How to Create a Totally Free Portfolio or Website appeared first on Content Simplicity.

Intro to Deep Learning

Anne B — Fri, 12 Apr 2019 15:55:19 +0000

Photo by ibjennyjenny on Pixabay

What’s going on behind a deep learning algorithm?

(This article first appeared in Towards Data Science)

We live in a world where, for better and for worse, we are constantly surrounded by deep learning algorithms. From social network filtering to driverless cars to movie recommendations, and from financial fraud detection to drug discovery to medical image processing (…is that bump cancer?), the field of deep learning influences our lives and our decisions every single day.

In fact, you’re probably reading this article right now because a deep learning algorithm thinks you should see it.

Photo by tookapic on Pixabay

If you’re looking for the basics of deep learning, artificial neural networks, convolutional neural networks, (neural networks in general…), backpropagation, gradient descent, and more, you’ve come to the right place. In this series of articles, I’m going to explain these concepts as simply and comprehensibly as I can.

There will also be cats.

Learning is so much easier when it’s sprinkled with a little silly.

Photo by skeeze on Pixabay

If you get into deep learning, there’s an incredible amount of really in-depth information out there. I’ll make sure to provide additional resources along the way for anyone who wants to swim a little deeper into these waters. (For example, you might want to check out Efficient BackProp by Yann LeCun, et al., which is written by one of the most important figures in deep learning. This paper looks specifically at backpropagation, but also discusses some of the most important topics in deep learning, like gradient descent, stochastic learning, batch learning, and so on. It’s all here if you want to take a look!)

For now, let’s jump right in!

Photo by Laurine Bailly on Unsplash

What is deep learning?

Really, it’s just learning from examples. That’s pretty much the deal.

At a very basic level, deep learning is a machine learning technique that teaches a computer to filter inputs (observations in the form of images, text, or sound) through layers in order to learn how to predict and classify information.

Deep learning algorithms are inspired by the way that the human brain filters information!

Photo by Christopher Campbell on Unsplash

Essentially, deep learning is a part of the machine learning family that’s based on learning data representations (rather than task-specific algorithms). Deep learning is actually closely related to a class of theories about brain development proposed by cognitive neuroscientists in the early ’90s. Just like in the brain (or, more accurately, in the theories and models put together by researchers in the ’90s regarding the development of the human neocortex), neural networks use a hierarchy of layered filters in which each layer learns from the previous layer and then passes its output to the next layer.

Deep learning attempts to mimic the activity in layers of neurons in the neocortex.

In the human brain, there are roughly 86 billion neurons, and each neuron is connected to thousands of its neighbors. Essentially, that is what we’re trying to create, but in a way and at a level that works for machines.

Photo by GDJ on Pixabay

The purpose of deep learning is to mimic how the human brain works in order to create some real magic.

What does this mean in terms of neurons, axons, dendrites, and so on? Well, the neuron has a body, dendrites, and an axon. The signal from one neuron travels down the axon and is transferred to the dendrites of the next neuron. That connection (not an actual physical connection, but a connection nonetheless) where the signal is passed is called a synapse.

Photo by mohamed_hassan on Pixabay

Neurons by themselves are kind of useless, but when you have lots of them, they work together to create some serious magic. That’s the idea behind a deep learning algorithm! You get input from observation, you put your input into one layer that creates an output which in turn becomes the input for the next layer, and so on. This happens over and over until your final output signal!

So the neuron (or node) gets a signal or signals (input values), which pass through the neuron, and that delivers the output signal. Think of the input layer as your senses: the things you see, smell, feel, etc. These are independent variables for one single observation. This information is broken down into numbers and the bits of binary data that a computer can use. (You will need to either standardize or normalize these variables so that they’re within the same range.)
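Here’s a minimal sketch of both options in plain Python. The function names and input values are made up for illustration, and both functions assume the values aren’t all identical (otherwise you’d be dividing by zero).

```python
from statistics import mean, pstdev


def normalize(values):
    """Min-max normalization: rescale the values to the range 0 to 1."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]


def standardize(values):
    """Standardization: shift to mean 0 and scale to standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


ages = [20, 30, 40, 50, 60]                          # made-up inputs
salaries = [30_000, 45_000, 60_000, 75_000, 90_000]  # made-up inputs

print(normalize(ages))      # [0.0, 0.25, 0.5, 0.75, 1.0]
print(normalize(salaries))  # [0.0, 0.25, 0.5, 0.75, 1.0]
print(standardize(ages))
```

Ages and salaries start out on wildly different scales, but after normalizing they land in the same range, so neither one drowns out the other.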

What can our output value be? It can be continuous (for example, price), binary (yes or no), or categorical (cat, dog, moose, hedgehog, sloth, etc.). If it’s categorical, remember that your output won’t be just one variable, but several output variables: one for each category.
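A common way to represent a categorical output as several variables is one-hot encoding: one output per category, with a 1 marking the category that applies. Here’s a minimal sketch in plain Python, with a made-up function name and the category list from the example above.

```python
categories = ["cat", "dog", "moose", "hedgehog", "sloth"]


def one_hot(label):
    """One output variable per category: 1 for the matching category, 0 elsewhere."""
    return [1 if category == label else 0 for category in categories]


print(one_hot("moose"))  # [0, 0, 1, 0, 0]
```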

Photo by Hanna Listek on Unsplash

Also, keep in mind that your output value will always be related to the same single observation from the input values. If your input values were, for example, an observation of the age, salary, and vehicle of one person, your output value would also relate to the same observation of the same person. This sounds pretty basic, but it’s important to keep in mind.

What about synapses? Each of the synapses gets assigned weights, which are crucial to Artificial Neural Networks (ANNs). Weights are how ANNs learn. By adjusting the weights, the ANN decides to what extent signals get passed along. When you’re training your network, you’re deciding how the weights are adjusted.

What happens inside the neuron? First, all of the values that it’s getting are multiplied by their weights and added up (the weighted sum is calculated). Next, it applies an activation function to that sum. From the result, the neuron understands if it needs to pass along a signal or not.
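Here’s a minimal sketch of a single neuron in plain Python. The sigmoid used here is just one common choice of activation function and is an assumption on my part, since we haven’t picked one yet. The input values and weights are made up.

```python
import math


def sigmoid(z):
    """One common activation function: squashes any number into the range 0 to 1."""
    return 1 / (1 + math.exp(-z))


def neuron(inputs, weights):
    """Weighted sum of the inputs, passed through the activation function."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return sigmoid(weighted_sum)


# Made-up input values and weights for a single observation.
print(neuron([0.5, 0.2, 0.9], weights=[0.4, -0.6, 0.1]))
print(neuron([0.0, 0.0, 0.0], weights=[0.4, -0.6, 0.1]))  # 0.5
```

Changing the weights changes how strongly each input pushes the output up or down, which is why adjusting weights is how the network learns.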

This is repeated thousands or even hundreds of thousands of times in a deep learning algorithm!

Photo by Geralt on Pixabay

We create an artificial neural net where we have nodes for input values (what we already know) and output values (our predictions), and in between those, we have a hidden layer (or layers) where the information travels before it hits the output. This is analogous to the way that the information you see through your eyes is filtered into your understanding, rather than being shot straight into your brain.

Image by Geralt on Pixabay

Deep learning models can be supervised, semi-supervised, and unsupervised.

Say what?

Supervised learning

Are you into psychology? This is essentially the machine version of “concept learning.” You know what a concept is (for example an object, idea, event, etc.) based on the belief that each object/idea/event has common features.

The idea here is that you can be shown a set of example objects with their labels and learn to classify objects based on what you have already been shown. You simplify what you’ve learned from what you’ve been shown, condense it in the form of an example, and then you take that simplified version and apply it to future examples. We really just call this “learning from examples.”

Photo by Gaelle Marcel on Unsplash

(Dress that baby up a little and it looks like this: concept learning refers to the process of inferring a Boolean-valued function from training examples of its input and output.)

In a nutshell, supervised machine learning is the task of learning a function that maps an input to an output based on example input-output pairs. It works with labeled training data made up of training examples. Each example is a pair that’s made up of an input object (usually a vector) and the output value that you want (also called the supervisory signal). Your algorithm analyzes the training data and produces an inferred function which can be used to map new examples. Ideally, the algorithm will allow you to classify examples that it hasn’t seen before.

Basically, it looks at stuff with labels and uses what it learns from the labeled stuff to predict the labels of other stuff.
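
To make "learning from labeled stuff" concrete, here's about the simplest supervised learner I can think of: a 1-nearest-neighbor classifier. It's a sketch with made-up data and a made-up function name, not a deep learning model, but the input-output pairs work the same way.

```python
def predict_label(example, training_pairs):
    """Return the label of the closest labeled training example (1-nearest-neighbor)."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    nearest_input, nearest_label = min(
        training_pairs, key=lambda pair: squared_distance(pair[0], example)
    )
    return nearest_label


# Hypothetical labeled training data: (input vector, label)
training_pairs = [
    ((1.0, 1.0), "cat"),
    ((1.2, 0.8), "cat"),
    ((5.0, 5.2), "dog"),
    ((4.8, 5.1), "dog"),
]

# Two examples the "model" has never seen before
print(predict_label((1.1, 0.9), training_pairs))  # cat
print(predict_label((5.1, 4.9), training_pairs))  # dog
```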

Classification tasks tend to depend on supervised learning. These tasks might include

Detecting faces, identities, and facial expressions in images
Identifying objects in images like stop signs, pedestrians, and lane markers
Classifying text as spam
Recognizing gestures in videos
Detecting voices and identifying sentiment in audio recordings
Identifying speakers
Transcribing speech-to-text

Semi-supervised learning

This one is more like the way you learned from a combination of what your parents explicitly told you as a child (labeled information) and what you learned on your own that didn’t have labels, like the flowers and trees that you observed without naming or counting them.

Photo by Robert Collins on Unsplash

Semi-supervised learning does the same kind of thing as supervised learning, but it’s able to make use of both labeled and unlabeled data for training. In semi-supervised learning, you’re often looking at a lot of unlabeled data and a little bit of labeled data. There are a number of researchers out there who have found that this process can provide more accuracy than unsupervised learning but without the time and costs associated with labeled data. (Sometimes labeling data requires a skilled human being to do things like transcribe audio files or analyze 3D images in order to create labels, which can make creating a fully labeled data set pretty unfeasible, especially when you’re working with those massive data sets that deep learning tasks love.)

Semi-supervised learning can be referred to as transductive (inferring correct labels for the given data) or inductive (inferring the correct mapping from X to Y).

In order to make use of the unlabeled data, the learning algorithm has to make at least one of the following assumptions:

Points that are close to each other probably share a label (continuity assumption)
The data like to form clusters and the points that are clustered together probably share a label (cluster assumption)
The data lie on a manifold of lower dimension than the input space (manifold assumption). Okay, that’s complicated, but think of it as if you were trying to analyze someone talking — you’d probably want to look at her facial muscles moving her face and her vocal cords making sound and stick to that area, rather than looking in the space of all images and/or all acoustic waves.
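
Here's a rough sketch of the first assumption (points that are close together probably share a label) in action. It starts with just two labeled points and spreads their labels to the unlabeled ones, closest first. The function name and the data are invented for illustration; real semi-supervised methods are a lot more careful than this.

```python
def propagate_labels(labeled, unlabeled):
    """Give each unlabeled point the label of its nearest already-labeled point.

    Points are labeled closest-first, and each newly labeled point can then
    pass its label on to its own neighbors.
    """
    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    known = dict(labeled)       # point -> label
    remaining = list(unlabeled)
    while remaining:
        # Find the unlabeled point that sits closest to any labeled point
        point, source = min(
            ((p, k) for p in remaining for k in known),
            key=lambda pair: distance(pair[0], pair[1]),
        )
        known[point] = known[source]
        remaining.remove(point)
    return known


# A little bit of labeled data and more unlabeled data (all made up)
labeled = [((0.0, 0.0), "A"), ((10.0, 10.0), "B")]
unlabeled = [(1.0, 1.0), (2.0, 2.0), (9.0, 9.0), (8.0, 8.0)]
result = propagate_labels(labeled, unlabeled)
print(result)
```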

Unsupervised learning (Hebbian learning is one example)

Unsupervised learning involves learning the relationships between elements in a data set and classifying the data without the help of labels. There are a lot of algorithmic forms that this can take, but they all have the same goal of mimicking human logic by searching for hidden structures, features, and patterns in order to analyze new data. These algorithms can include clustering, anomaly detection, neural networks, and more.

Clustering is essentially the detection of similarities or anomalies within a data set and is a good example of an unsupervised learning task. Clustering can produce highly accurate search results by comparing documents, images, or sounds for similarities and anomalies. Being able to go through a huge amount of data to cluster “ducks” or perhaps the sound of a voice has many, many potential applications. Being able to detect anomalies and unusual behavior accurately can be extremely beneficial for applications like security and fraud detection.
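
As a sketch of what clustering looks like without any labels at all, here's a bare-bones k-means on one-dimensional points. I'm assuming k-means as the example algorithm (it's just one of many), the data is made up, and the starting centroids are simply the first k points to keep things deterministic.

```python
def k_means(points, k, iterations=10):
    """Group 1-D points into k clusters; returns (centroids, assignments)."""
    centroids = points[:k]  # simple deterministic start: the first k points
    assignments = []
    for _ in range(iterations):
        # Assign each point to its nearest centroid
        assignments = [
            min(range(k), key=lambda c: abs(p - centroids[c])) for p in points
        ]
        # Move each centroid to the mean of the points assigned to it
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            new_centroids.append(sum(members) / len(members) if members else centroids[c])
        centroids = new_centroids
    return centroids, assignments


points = [1.0, 1.2, 0.8, 9.0, 9.5, 8.7]
centroids, assignments = k_means(points, k=2)
print(assignments)  # [0, 0, 0, 1, 1, 1]
print(centroids)
```

Nobody told the algorithm which points belong together. It found the two groups on its own.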

Photo by Andrew Wulf on Unsplash

Back to it!

Deep learning algorithms and architectures have been applied to social network filtering, image recognition, financial fraud detection, speech recognition, computer vision, medical image processing, natural language processing, visual art processing, drug discovery and design, toxicology, bioinformatics, customer relationship management, audio recognition, and many, many other fields and concepts. Deep learning models are everywhere!

There are, of course, a number of deep learning techniques that exist, like convolutional neural networks, recurrent neural networks, and so on. No one network is better than the others, but some are definitely better suited to specific tasks.

Deep Learning and Artificial Neural Networks

The majority of modern deep learning architectures are based on Artificial Neural Networks (ANNs) and use multiple layers of nonlinear processing units for feature extraction and transformation. Each successive layer uses the output of the previous layer for its input. What they learn forms a hierarchy of concepts where each level learns to transform its input data into a slightly more abstract and composite representation.

Image by ahmedgad on Pixabay

That means that for an image, for example, the input might be a matrix of pixels. The first layer might abstract the pixels and encode edges, the next layer might compose an arrangement of edges, the next layer might encode a nose and eyes, the next layer might recognize that the image contains a face, and so on. While you may need to do a little fine-tuning, the deep learning process learns which features to place in which level on its own!

Photo by Cristian Newman on Unsplash

The “deep” in deep learning just refers to the number of layers through which the data is transformed. Deep learning systems have a substantial credit assignment path (CAP) depth, where the CAP is the chain of transformations from input to output. For a feedforward neural network, the depth of the CAPs is that of the network: the number of hidden layers plus one (for the output layer). For a recurrent neural network, a signal might propagate through a layer more than once, so the CAP depth is potentially unlimited! Most researchers agree that deep learning involves a CAP depth greater than 2.
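
The arithmetic for feedforward networks is simple enough to write down. The function names here are my own, and the "greater than 2" cutoff is just the rule of thumb mentioned above:

```python
def feedforward_cap_depth(num_hidden_layers):
    """CAP depth of a feedforward network: hidden layers plus the output layer."""
    return num_hidden_layers + 1


def is_deep(cap_depth):
    """Rule of thumb: deep learning involves a CAP depth greater than 2."""
    return cap_depth > 2


for hidden in (1, 2, 5):
    depth = feedforward_cap_depth(hidden)
    print(hidden, "hidden layer(s) -> CAP depth", depth, "-> deep?", is_deep(depth))
```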

Convolutional Neural Networks

One of the most popular types of neural network is the convolutional neural network (CNN). The CNN convolves (not convolutes…) learned features with input data and uses 2D convolutional layers, which means that this type of network is ideal for processing (2D) images. The CNN works by extracting features from images, meaning that the need for manual feature extraction is eliminated. The features aren’t picked by hand ahead of time! They’re learned while the network trains on a set of images, which makes deep learning models extremely accurate for computer vision tasks. CNNs learn feature detection through tens or hundreds of hidden layers, with each layer increasing the complexity of the learned features.
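
To get a feel for what "convolving a feature with input data" means, here's a small pure-Python sketch. The filter values are written by hand as an assumption for the demo; in a real CNN they'd be learned during training. (Like most deep learning libraries, this slides the filter without flipping it.)

```python
def convolve2d(image, kernel):
    """Slide a kernel over a 2-D image (no padding, stride 1) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# A tiny image: dark on the left, bright on the right
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# A hand-written vertical-edge filter; a CNN would learn values like these
kernel = [
    [-1, 1],
    [-1, 1],
]
print(convolve2d(image, kernel))  # [[0, 2, 0], [0, 2, 0]]
```

The feature map lights up exactly where the dark-to-bright edge is and stays at zero everywhere else.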

If you want to keep going with me, we tackle CNNs in depth in part 3!

(Want to learn more? Check out Introduction to Convolutional Neural Networks by Jianxin Wu and Yann LeCun’s original article, Gradient-Based Learning Applied to Document Recognition.)

Recurrent neural networks

While convolutional neural networks are typically used for processing images, recurrent neural networks (RNNs) are used for processing language. RNNs don’t just filter information from one layer into the next. They have built-in feedback loops where the output from one layer might be fed back into the layer preceding it. This actually lends the network a sort of memory.
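
Here's a heavily simplified sketch of that feedback loop, using a single number as the "memory" (real RNNs use vectors and learned weight matrices). The weights and the function names are made up for illustration.

```python
import math


def rnn_step(x, h_prev, w_in, w_rec):
    """One recurrent step: mix the new input with the previous hidden state."""
    return math.tanh(w_in * x + w_rec * h_prev)


def run_rnn(sequence, w_in=0.5, w_rec=0.8):
    h = 0.0  # the network starts with an empty memory
    states = []
    for x in sequence:
        h = rnn_step(x, h, w_in, w_rec)  # the old state is fed back in
        states.append(h)
    return states


# One "blip" of input followed by silence
states = run_rnn([1.0, 0.0, 0.0])
print(states)
```

Even though the second and third inputs are zero, the states stay above zero and fade gradually. That lingering trace of the first input is the memory.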

Generative adversarial networks

In generative adversarial networks (GANs), two neural networks fight it out. The generator network tries to create convincing “fake” data while the discriminator tries to tell the difference between the fake data and the real stuff. With each training cycle, the generator gets better at creating fake data and the discriminator gets sharper at spotting the fakes. By pitting the two against each other during training, both networks improve. (Basically, shirts vs. skins here. The home team is playing itself to improve its game.) GANs can be used for extremely interesting applications, including generating images from written text. GANs can be tough to work with, but more robust models are constantly being developed.

Deep Learning in the Future

The future is full of potential for anyone interested in deep learning. The most remarkable thing about a neural network is its ability to deal with vast amounts of disparate data. That becomes more and more relevant now that we’re living in an era of advanced smart sensors, which can gather an unbelievable amount of data every second of every day. It’s estimated that we are currently generating 2.6 quintillion bytes of data every single day. That’s an enormous amount of data. While traditional approaches have trouble dealing with and drawing conclusions from so much data, deep learning tends to get better as the amount of data grows larger. Neural nets are capable of discovering latent structures within vast amounts of unstructured data, such as raw media, which makes up the majority of data in the world.

The possibilities are endless!

Still with me? Check out part 2 where we take a deeper look at artificial neural networks. Then head on over to part 3 where we tackle image classification and convolutional neural networks!

Want to see how to build a deep learning model from the ground up? Check out this article that tells you exactly how to build an image classifier with PyTorch that has greater than 97% accuracy!

Need some free GPU, but not sure where to find it? Check out Getting Started with Google Colab.

Have you already finished a machine learning model, but you don’t know what to do with it next?

Why not deploy it to the internet?

Check out this article to learn how to deploy your machine learning model with Flask!

Thank you for reading!

The post Intro to Deep Learning appeared first on Content Simplicity.

How to build an image classifier with greater than 97% accuracy

Anne B — Wed, 10 Apr 2019 08:34:33 +0000

(This article first appeared in Free Code Camp)

Image classifiers are amazing.

How do you teach a computer to look at an image and correctly identify it as a flower? How do you teach a computer to see an image of a flower and then tell you exactly what species of flower it is when even you don’t know what species it is?

Let me show you!

This article will take you through the basics of creating an image classifier with PyTorch. You can imagine using something like this in a phone app that tells you the name of the flower your camera is looking at. You could, if you wanted, train this classifier and then export it for use in an application of your own.

What you do from here depends entirely on you and your imagination.

I put this article together for anyone out there who’s brand new to all of this and looking for a place to begin. It’s up to you to take this information, improve on it, and make it your own! Build an even better image classifier if you want to!

If you want to view the notebook, you can find it here.

Because this PyTorch image classifier was built as a final project for a Udacity program, the code draws on code from Udacity which, in turn, draws on the official PyTorch documentation. Udacity also provided a JSON file for label mapping. That file can be found in this GitHub repo.

Information about the flower data set can be found here. The data set includes a separate folder for each of the 102 flower classes. Each flower is labeled as a number and each of the numbered directories holds a number of .jpg files.

Let’s make an image classifier!

Photo by Annie Spratt on Unsplash

Because this is a neural network using a larger dataset than my CPU could handle in any reasonable amount of time, I went ahead and set up my image classifier in Google Colab. Colab is truly awesome because it provides free GPU. (If you’re new to Colab, check out this article on getting started with Google Colab!)

Because I was using Colab, I needed to start by importing PyTorch. You don’t need to do this if you aren’t using Colab.

*** UPDATE! (01/29)*** Colab now supports native PyTorch!!! You shouldn’t need to run the code below, but I’m leaving it up just in case anyone is having any issues!

Then, after having some trouble with Pillow (it’s buggy in Colab!), I just went ahead and ran this:

import PIL
# PILLOW_VERSION was removed in newer Pillow releases; __version__ reports the same thing
print(PIL.__version__)

If you get anything below 5.3.0, use the dropdown menu under “Runtime” to “Restart runtime” and run this cell again. You should be good to go!

You’ll want to be using GPU for this project, which is incredibly simple to set up on Colab. You just go to the “runtime” dropdown menu, select “change runtime type” and then select “GPU” in the hardware accelerator drop-down menu!

Then I like to run

import torch

train_on_gpu = torch.cuda.is_available()

if not train_on_gpu:
    print('Bummer!  Training on CPU ...')
else:
    print('You are good to go!  Training on GPU ...')

just to make sure it’s working. Then run

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

to define the device.

After this, import the files. There are a ton of ways to do this, including mounting your Google Drive if you have your dataset stored there, which is actually really simple. Even though I didn’t wind up finding that to be the most useful solution, I’m including that below, just because it’s so easy and useful.

from google.colab import drive
drive.mount('/content/gdrive')

Then you’ll see a link. Click on it, allow access, copy the code that pops up, paste it in the box, and hit enter. You’re good to go! If you don’t see your drive in the side box on the left, just hit “refresh” and it should show up.


It’s actually super easy!

However, if you’d rather download a shared zip file link (this wound up being easier and faster for this project), you can use:

!wget 
!unzip

For example:

!wget -cq https://s3.amazonaws.com/content.udacity-data.com/courses/nd188/flower_data.zip
!unzip -qq flower_data.zip

That will give you Udacity’s flower data set in seconds!

(If you’re only uploading small files, you can upload them directly with a few lines of code. If you don’t feel like running any code to grab a local file, you can also just go to the left side of the screen and click “upload files.”)

After loading the data, I imported the libraries I wanted to use for this image classifier:

%matplotlib inline
%config InlineBackend.figure_format = 'retina'

import os
import time
import json
import copy

import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import PIL

from PIL import Image
from collections import OrderedDict

import torch
from torch import nn, optim
from torch.optim import lr_scheduler
from torch.autograd import Variable
import torchvision
from torchvision import datasets, models, transforms
from torch.utils.data.sampler import SubsetRandomSampler
import torch.nn as nn
import torch.nn.functional as F

Next come the data transformations! You want to make sure to use several different types of transformations on your training set in order to help your program learn as much as it can. You can create a more robust model by training it on flipped, rotated, and cropped images.

The means and standard deviations are provided to normalize the image values before passing them to our network, but they can also be found by looking at the mean and standard deviation values of the different dimensions of the image tensors. The official documentation is incredibly helpful here!
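
If you're curious what "finding the mean and standard deviation" and "normalizing" boil down to, here's a plain-Python sketch on a tiny made-up image. The helper names are my own. In the real project, `transforms.Normalize` does the `(value - mean) / std` step for you, channel by channel.

```python
from statistics import mean, pstdev


def channel_stats(image):
    """Return per-channel (means, stds) for an image stored as [channel][row][col]."""
    means, stds = [], []
    for channel in image:
        pixels = [value for row in channel for value in row]
        means.append(mean(pixels))
        stds.append(pstdev(pixels))
    return means, stds


def normalize_image(image, means, stds):
    """Apply (value - mean) / std to every pixel, channel by channel."""
    return [
        [[(value - m) / s for value in row] for row in channel]
        for channel, m, s in zip(image, means, stds)
    ]


# A made-up 3-channel, 2x2 "image" with values already scaled to [0, 1]
image = [
    [[0.0, 0.5], [0.5, 1.0]],
    [[0.2, 0.4], [0.6, 0.8]],
    [[0.1, 0.1], [0.9, 0.9]],
]
means, stds = channel_stats(image)
normalized = normalize_image(image, means, stds)
print(means, stds)
```

After normalizing, each channel is centered on 0 with a standard deviation of 1.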

For my image classifier, I kept it simple with:

data_transforms = {
    'train': transforms.Compose([
        transforms.RandomRotation(30),
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], 
                             [0.229, 0.224, 0.225])
    ]),
    'valid': transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], 
                             [0.229, 0.224, 0.225])
    ])
}

# The zip file downloaded above extracts into a 'flower_data' folder
data_dir = 'flower_data'

# Load the datasets with ImageFolder
image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x),
                                          data_transforms[x])
                  for x in ['train', 'valid']}

# Using the image datasets and the transforms, define the dataloaders
batch_size = 64
dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=batch_size,
                                             shuffle=True, num_workers=4)
              for x in ['train', 'valid']}

dataset_sizes = {x: len(image_datasets[x]) for x in ['train', 'valid']}
class_names = image_datasets['train'].classes

As you can see, I also defined the batch size, data loaders, and class names in the code above.

To take a very quick look at the data and check my device, I ran:

print(dataset_sizes)
print(device)

{'train': 6552, 'valid': 818}
cuda:0

Next, we need to map each label number to the actual flower name. Udacity provided a JSON file that makes this mapping simple.

with open('cat_to_name.json', 'r') as f:
    cat_to_name = json.load(f)

In order to test the data loader, run:

images, labels = next(iter(dataloaders['train']))
rand_idx = np.random.randint(len(images))
# print(rand_idx)
print("label: {}, class: {}, name: {}".format(labels[rand_idx].item(),
                                               class_names[labels[rand_idx].item()],
                                               cat_to_name[class_names[labels[rand_idx].item()]]))

Now it starts to get even more exciting! A number of models in the last several years have been created by people far, far more qualified than most of us for reuse in computer vision problems. PyTorch makes it easy to load pre-trained models and build on them, which is exactly what we’re going to do for this project. The choice of model is entirely up to you!

Some of the most popular pre-trained models that work well for image classifiers, like ResNet, AlexNet, and VGG, come from the ImageNet Challenge. These pre-trained models allow others to quickly obtain cutting-edge results in computer vision without needing such large amounts of computer power, patience, and time. I actually had great results with DenseNet and decided to use DenseNet161, which gave me very good results relatively quickly.

You can quickly set this up by running

model = models.densenet161(pretrained=True)

but it might be more interesting to give yourself a choice of model, optimizer, and scheduler. In order to set up a choice in architecture, run

model_name = 'densenet' #vgg
if model_name == 'densenet':
    model = models.densenet161(pretrained=True)
    num_in_features = 2208
    print(model)
elif model_name == 'vgg':
    model = models.vgg19(pretrained=True)
    num_in_features = 25088
    print(model.classifier)
else:
    print("Unknown model, please choose 'densenet' or 'vgg'")

which allows you to quickly set up an alternate model.

After that, you can start to build your classifier, using the parameters that work best for you. I went ahead and built

for param in model.parameters():
    param.requires_grad = False

def build_classifier(num_in_features, hidden_layers, num_out_features):
   
    classifier = nn.Sequential()
    if hidden_layers is None:
        # No hidden layers: one linear layer maps the features straight to the classes
        classifier.add_module('fc0', nn.Linear(num_in_features, num_out_features))
    else:
        layer_sizes = zip(hidden_layers[:-1], hidden_layers[1:])
        classifier.add_module('fc0', nn.Linear(num_in_features, hidden_layers[0]))
        classifier.add_module('relu0', nn.ReLU())
        classifier.add_module('drop0', nn.Dropout(.6))
        # Each remaining hidden layer gets its own linear, ReLU, and dropout modules.
        # Module names must be unique: re-adding an existing name replaces that module.
        for i, (h1, h2) in enumerate(layer_sizes):
            classifier.add_module('fc'+str(i+1), nn.Linear(h1, h2))
            classifier.add_module('relu'+str(i+1), nn.ReLU())
            classifier.add_module('drop'+str(i+1), nn.Dropout(.5))
        classifier.add_module('output', nn.Linear(hidden_layers[-1], num_out_features))
        
    return classifier

which allows for an easy way to change the number of hidden layers that I’m using, as well as quickly adjusting the dropout rate. You may decide to add additional ReLU and dropout layers in order to more finely hone your model.
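
To see how the `zip(hidden_layers[:-1], hidden_layers[1:])` trick pairs up layer sizes, here's a small PyTorch-free sketch. `layer_shapes` is a helper I made up for illustration, and the hidden sizes `[1024, 512]` are just an example, not what I trained with.

```python
def layer_shapes(num_in_features, hidden_layers, num_out_features):
    """List the (in, out) size of each linear layer the classifier would contain."""
    if hidden_layers is None:
        return [(num_in_features, num_out_features)]
    shapes = [(num_in_features, hidden_layers[0])]
    # Pair each hidden size with the one that follows it
    shapes += list(zip(hidden_layers[:-1], hidden_layers[1:]))
    shapes.append((hidden_layers[-1], num_out_features))
    return shapes


print(layer_shapes(2208, None, 102))         # [(2208, 102)]
print(layer_shapes(2208, [1024, 512], 102))  # [(2208, 1024), (1024, 512), (512, 102)]
```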

Next, work on training your classifier parameters. I decided to make sure I only trained the classifier parameters here while having feature parameters frozen. You can get as creative as you want with your optimizer, criterion, and scheduler. The criterion is the method used to evaluate the model fit, the optimizer is the optimization method used to update the weights, and the scheduler provides different methods for adjusting the learning rate and step size used during optimization.
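
As one example of what a scheduler does, a step schedule like `StepLR` multiplies the learning rate by `gamma` every `step_size` epochs. Here's that rule as a plain function. `step_lr` is my own illustrative name, and the sketch assumes the scheduler is stepped once per epoch.

```python
def step_lr(base_lr, epoch, step_size, gamma=0.1):
    """Learning rate after `epoch` epochs when it is multiplied by gamma every step_size epochs."""
    return base_lr * gamma ** (epoch // step_size)


# With step_size=4, the rate drops by 10x at epochs 4, 8, 12, ...
for epoch in (0, 3, 4, 8):
    print(epoch, step_lr(0.0001, epoch, step_size=4))
```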

Try as many options and combinations as you can to see what gives you the best result. You can see all of the official documentation here. I recommend taking a look at it and making your own decisions about what you want to use. You don’t literally have an infinite number of options here, but it sure feels like it once you start playing around!

hidden_layers = None

classifier = build_classifier(num_in_features, hidden_layers, 102)
print(classifier)

# Only train the classifier parameters, feature parameters are frozen
if model_name == 'densenet':
    model.classifier = classifier
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adadelta(model.parameters())
    sched = optim.lr_scheduler.StepLR(optimizer, step_size=4)
elif model_name == 'vgg':
    model.classifier = classifier
    # The classifier outputs raw scores (there's no LogSoftmax layer), so use
    # CrossEntropyLoss here too; NLLLoss would expect log-probabilities
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.classifier.parameters(), lr=0.0001)
    sched = lr_scheduler.StepLR(optimizer, step_size=4, gamma=0.1)
else:
    pass

Now it’s time to train your model.

# Adapted from https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html

def train_model(model, criterion, optimizer, sched, num_epochs=5):
    since = time.time()

    best_model_wts = copy.deepcopy(model.state_dict())
    best_acc = 0.0

    for epoch in range(num_epochs):
        print('Epoch {}/{}'.format(epoch+1, num_epochs))
        print('-' * 10)

        # Each epoch has a training and validation phase
        for phase in ['train', 'valid']:
            if phase == 'train':
                model.train()  # Set model to training mode
            else:
                model.eval()   # Set model to evaluate mode

            running_loss = 0.0
            running_corrects = 0

            # Iterate over data.
            for inputs, labels in dataloaders[phase]:
                inputs = inputs.to(device)
                labels = labels.to(device)

                # Zero the parameter gradients
                optimizer.zero_grad()

                # Forward
                # track history if only in train
                with torch.set_grad_enabled(phase == 'train'):
                    outputs = model(inputs)
                    _, preds = torch.max(outputs, 1)
                    loss = criterion(outputs, labels)

                    # Backward + optimize only if in training phase
                    if phase == 'train':
                        #sched.step()
                        loss.backward()
                        optimizer.step()

                # Statistics
                running_loss += loss.item() * inputs.size(0)
                running_corrects += torch.sum(preds == labels.data)

            epoch_loss = running_loss / dataset_sizes[phase]
            epoch_acc = running_corrects.double() / dataset_sizes[phase]

            print('{} Loss: {:.4f} Acc: {:.4f}'.format(
                phase, epoch_loss, epoch_acc))

            # Deep copy the model
            if phase == 'valid' and epoch_acc > best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())

        print()

    time_elapsed = time.time() - since
    print('Training complete in {:.0f}m {:.0f}s'.format(
        time_elapsed // 60, time_elapsed % 60))
    print('Best val Acc: {:4f}'.format(best_acc))

    # Load best model weights
    model.load_state_dict(best_model_wts)
    
    return model

epochs = 30
model.to(device)
model = train_model(model, criterion, optimizer, sched, epochs)

I wanted to be able to monitor my epochs easily and also keep track of the time elapsed as my model was running. The code above includes both, and the results are pretty good! You can see that the model is quickly learning and the accuracy on the validation set quickly reached over 95% by epoch 7!

Epoch 1/30
----------
train Loss: 2.4793 Acc: 0.4791
valid Loss: 0.9688 Acc: 0.8191

Epoch 2/30
----------
train Loss: 0.8288 Acc: 0.8378
valid Loss: 0.4714 Acc: 0.9010

Epoch 3/30
----------
train Loss: 0.5191 Acc: 0.8890
valid Loss: 0.3197 Acc: 0.9181

Epoch 4/30
----------
train Loss: 0.4064 Acc: 0.9095
valid Loss: 0.2975 Acc: 0.9169

Epoch 5/30
----------
train Loss: 0.3401 Acc: 0.9214
valid Loss: 0.2486 Acc: 0.9401

Epoch 6/30
----------
train Loss: 0.3111 Acc: 0.9303
valid Loss: 0.2153 Acc: 0.9487

Epoch 7/30
----------
train Loss: 0.2987 Acc: 0.9298
valid Loss: 0.1969 Acc: 0.9584

...

Training complete in 67m 43s
Best val Acc: 0.973105

You can see that running this code on Google Colab with GPU took just over an hour.

Now it’s time for evaluation

model.eval()

accuracy = 0

# Gradients aren't needed for evaluation, so turn them off to save memory
with torch.no_grad():
    for inputs, labels in dataloaders['valid']:
        inputs, labels = inputs.to(device), labels.to(device)
        outputs = model(inputs)

        # Class with the highest score is our predicted class
        equality = (labels == outputs.max(1)[1])

        # Batch accuracy = number of correct predictions divided by all predictions
        accuracy += equality.float().mean().item()

# Average the per-batch accuracies
print("Test accuracy: {:.3f}".format(accuracy/len(dataloaders['valid'])))

Test accuracy: 0.973

It’s important to save your checkpoint

model.class_to_idx = image_datasets['train'].class_to_idx

checkpoint = {'input_size': 2208,
              'output_size': 102,
              'epochs': epochs,
              'batch_size': 64,
              'model': models.densenet161(pretrained=True),
              'classifier': classifier,
              'scheduler': sched,
              'optimizer': optimizer.state_dict(),
              'state_dict': model.state_dict(),
              'class_to_idx': model.class_to_idx
             }
   
torch.save(checkpoint, 'checkpoint.pth')

You don’t have to save all of the parameters, but I’m including them here as an example. This checkpoint specifically saves the model with a pre-trained densenet161 architecture, but if you want to save your checkpoint with the two-choice option, you can absolutely do that. Simply adjust the input size and model.

Now you’re able to load your checkpoint. If you’re submitting your project into the Udacity workspace, things can get a little tricky. Here’s some help with troubleshooting your checkpoint load.

You can check your keys by running

ckpt = torch.load('checkpoint.pth')
ckpt.keys()

Then load and rebuild your model!

def load_checkpoint(filepath):
    checkpoint = torch.load(filepath)
    model = checkpoint['model']
    model.classifier = checkpoint['classifier']
    model.load_state_dict(checkpoint['state_dict'])
    model.class_to_idx = checkpoint['class_to_idx']
    optimizer = checkpoint['optimizer']
    epochs = checkpoint['epochs']
    
    for param in model.parameters():
        param.requires_grad = False
        
    return model, checkpoint['class_to_idx']

model, class_to_idx = load_checkpoint('checkpoint.pth')

Want to keep going? It’s a good idea to do some image preprocessing and inference for classification. Go ahead and define your image path and open an image:

image_path = 'flower_data/valid/102/image_08006.jpg'
img = Image.open(image_path)

Process your image and take a look at a processed image:

def process_image(image):
    ''' Scales, crops, and normalizes a PIL image for a PyTorch model,
        returns a PyTorch tensor
    '''
    # Process a PIL image for use in a PyTorch model
    # (to display the result with matplotlib, use tensor.numpy().transpose(1, 2, 0))
    preprocess = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], 
                             std=[0.229, 0.224, 0.225])
    ])
    image = preprocess(image)
    return image

def imshow(image, ax=None, title=None):
    """Imshow for Tensor."""
    if ax is None:
        fig, ax = plt.subplots()
    
    # PyTorch tensors assume the color channel is the first dimension
    # but matplotlib assumes is the third dimension
    image = image.numpy().transpose((1, 2, 0))
    
    # Undo preprocessing
    mean = np.array([0.485, 0.456, 0.406])
    std = np.array([0.229, 0.224, 0.225])
    image = std * image + mean
    
    # Image needs to be clipped between 0 and 1 or it looks like noise when displayed
    image = np.clip(image, 0, 1)
    
    ax.imshow(image)
    
    return ax

with Image.open('flower_data/valid/102/image_08006.jpg') as image:
    plt.imshow(image)

model.class_to_idx = image_datasets['train'].class_to_idx

Create a function for prediction:

def predict2(image_path, model, topk=5):
    ''' Predict the class (or classes) of an image using a trained deep learning model.
    '''
    
    # Open the image and apply the same preprocessing used for validation
    img = Image.open(image_path)
    img = process_image(img)
    
    # Add a batch dimension: (3, 224, 224) -> (1, 3, 224, 224)
    img = img.unsqueeze(0)
    
    model.eval()
    inputs = img.to(device)
    
    # No gradients are needed for inference
    with torch.no_grad():
        logits = model(inputs)
    
    # Turn the raw scores into probabilities and keep the k largest
    ps = F.softmax(logits, dim=1)
    top_ps, top_idx = ps.cpu().topk(topk)
    
    # Return two plain Python lists: probabilities and class indices
    return top_ps.squeeze().tolist(), top_idx.squeeze().tolist()

Once the images are in the correct format, you can write a function to make predictions with your model. One common practice is to predict the top 5 or so (usually called top-K) most probable classes. You’ll want to calculate the class probabilities and then find the K largest values.
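
Here's what "calculate the class probabilities, then find the K largest" looks like stripped down to plain Python. The scores are made up and the helper names are my own. In the real project, PyTorch's softmax and `topk` do this work on tensors.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtracting the max keeps exp() from overflowing
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probabilities, k):
    """Return (probabilities, indices) of the k largest entries, largest first."""
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: probabilities[i], reverse=True)[:k]
    return [probabilities[i] for i in ranked], ranked


logits = [2.0, 0.5, 4.0, 1.0]  # made-up scores for four classes
probs, indices = top_k(softmax(logits), k=2)
print(probs, indices)  # the indices are [2, 0]
```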

To get the top KK largest values in a tensor use k.topk(). This method returns both the highest k probabilities and the indices of those probabilities corresponding to the classes. You need to convert from these indices to the actual class labels using class_to_idx, which you added to the model or from the Image Folder you used to load the data. Make sure to invert the dictionary so you get a mapping from index to class as well.

This method should take a path to an image and a model checkpoint, then return the probabilities and classes.
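
If the top-K and dictionary-inverting steps feel abstract, here's a minimal sketch in plain Python (no PyTorch needed). The logits and the class_to_idx mapping are made up for illustration, and softmax and top_k are hypothetical helpers that stand in for F.softmax and topk.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    biggest = max(logits)
    exps = [math.exp(x - biggest) for x in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def top_k(probs, k):
    """Return the k largest probabilities and their indices, largest first."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return [probs[i] for i in order], order

# Hypothetical mapping in the same shape as class_to_idx:
# folder name (class label) -> index the model uses internally
class_to_idx = {"1": 0, "10": 1, "100": 2, "18": 3}

# Invert it so we can go from index back to class label
idx_to_class = {idx: label for label, idx in class_to_idx.items()}

logits = [0.5, 2.0, -1.0, 4.0]  # made-up model outputs, one per class
probs, indices = top_k(softmax(logits), k=2)
classes = [idx_to_class[i] for i in indices]
print(classes)  # ['18', '10']
```

The key point is that the model only knows about indices, so you need the inverted dictionary to get back to the labels that cat_to_name understands.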

img_path = 'flower_data/valid/18/image_04252.jpg'
probs, classes = predict2(img_path, model.to(device))
print(probs)
print(classes)
flower_names = [cat_to_name[class_names[e]] for e in classes]
print(flower_names)

I was pretty pleased with how my model performed!

[0.9999195337295532, 1.4087702766119037e-05, 1.3897360986447893e-05, 1.1400215043977369e-05, 6.098791800468462e-06]
[12, 86, 7, 88, 40]
['peruvian lily', 'desert-rose', 'king protea', 'magnolia', 'sword lily']

Basically, it’s nearly 100% likely that the image I specified is a Peruvian Lily. Want to take a look? Try using matplotlib to plot the probabilities for the top five classes in a bar graph along with the input image:

def view_classify(img_path, prob, classes, mapping):
    ''' Function for viewing an image and its predicted classes.
    '''
    image = Image.open(img_path)

    fig, (ax1, ax2) = plt.subplots(figsize=(6,10), ncols=1, nrows=2)
    flower_name = mapping[img_path.split('/')[-2]]
    ax1.set_title(flower_name)
    ax1.imshow(image)
    ax1.axis('off')
    
    y_pos = np.arange(len(prob))
    ax2.barh(y_pos, prob, align='center')
    ax2.set_yticks(y_pos)
    ax2.set_yticklabels(flower_names)
    ax2.invert_yaxis()  # labels read top-to-bottom
    ax2.set_title('Class Probability')

view_classify(img_path, probs, classes, cat_to_name)

You should see something like this:

I’ve got to say, I’m pretty happy with that! I recommend testing a few other images to see how close your predictions are on a variety of images.

Now it’s time to make an image classifier of your own! Let me know how it goes in the responses below.

Photo by Pez González on Unsplash

Have you finished your deep learning or machine learning model, but you don’t know what to do with it next? Why not deploy it to the internet?

Get your model out there so everyone can see it!

Check out this article to learn how to deploy your machine learning model with Flask!

Thank you for reading!

The post How to build an image classifier with greater than 97% accuracy appeared first on Content Simplicity.

Getting started with Google Colab

Anne B — Mon, 08 Apr 2019 15:45:37 +0000

A Simple Tutorial for the Frustrated and Confused

Photo by FuYong Hua on Unsplash

(This article first appeared on Towards Data Science)

You know it’s out there. You know there’s free GPU somewhere, hanging like a fat, juicy, ripe blackberry on a branch just slightly out of reach.

Beautiful lightning-fast speed waiting just for you.

Wondering how on earth to get it to work? You’re in the right place!

Photo by Breno Machado on Unsplash

For anyone who doesn’t already know, Google has done the coolest thing ever by providing a free cloud service based on Jupyter Notebooks that supports free GPU. Not only is this a great tool for improving your coding skills, but it also allows absolutely anyone to develop deep learning applications using popular libraries such as PyTorch, TensorFlow, Keras, and OpenCV.

Colab provides GPU and it’s totally free. Seriously!

There are, of course, limits. (The nitty-gritty details are available on their FAQ page.) It supports Python 2.7 and 3.6, but not R or Scala yet. There is a limit to your sessions and size, but you can definitely get around that if you’re creative and don’t mind occasionally re-uploading your files…

Colab is ideal for everything from improving your Python coding skills to working with deep learning libraries, like PyTorch, Keras, TensorFlow, and OpenCV. You can create notebooks in Colab, upload notebooks, store notebooks, share notebooks, mount your Google Drive and use whatever you’ve got stored in there, import most of your favorite libraries, upload your personal Jupyter Notebooks, upload notebooks directly from GitHub, upload Kaggle files, download your notebooks, and do just about everything else that you might want to be able to do.

It’s awesome.

Working in Google Colab for the first time has been totally phenomenal and pretty shockingly easy, but it hasn’t been without a couple of small challenges! If you know Jupyter Notebooks at all, you’re pretty much good to go in Google Colab, but there are just a few little differences that can make the difference between flying off to freedom on the wings of free GPU and sitting at your computer, banging your head against the wall…

Photo by Gabriel Matula on Unsplash

This article is for anyone out there who is confused, frustrated, and just wants this thing to work!

Setting up your drive

Create a folder for your notebooks

(Technically speaking, this step isn’t totally necessary if you want to just start working in Colab. However, since Colab is working off of your drive, it’s not a bad idea to specify the folder where you want to work. You can do that by going to your Google Drive and clicking “New” and then creating a new folder. I only mention this because my Google Drive is embarrassingly littered with what looks like a million scattered Colab notebooks and now I’m going to have to deal with that.)

If you want, while you’re already in your Google Drive you can create a new Colab notebook. Just click “New” and drop the menu down to “More” and then select “Colaboratory.”

Otherwise, you can always go directly to Google Colab.

Game on!

You can rename your notebook by clicking on the name of the notebook and changing it or by dropping the “File” menu down to “Rename.”

Set up your free GPU

Want to use GPU? It’s as simple as going to the “runtime” dropdown menu, selecting “change runtime type” and selecting GPU in the hardware accelerator drop-down menu!

Get coding!

You can easily start running code now if you want! You are good to go!

Make it better

Want to mount your Google Drive? Use:

from google.colab import drive
drive.mount('/content/gdrive')

(Run the cell, click the link, copy the code on the page, paste it in the box, hit enter, and you’ll see this when you’ve successfully mounted your drive):

Now you can see your drive right there on the left-hand side of the screen! (You may need to hit “refresh.”) Plus, you can reach your drive any time with

!ls "/content/gdrive/My Drive/"

If you’d rather download a shared zip file link, you can use:

!wget 
!unzip

For example:

!wget -cq https://s3.amazonaws.com/content.udacity-data.com/courses/nd188/flower_data.zip
!unzip -qq flower_data.zip

That will give you Udacity’s flower data set in seconds!

If you’re uploading small files, you can upload them directly with some simple code. If you don’t feel like running code to grab a local file, you can also go to the left side of the screen and click “upload files.”

Google Colab is incredibly easy to use on pretty much every level, especially if you’re at all familiar with Jupyter Notebooks. However, grabbing some large files and getting a couple of specific directories to work did trip me up for a minute or two.

I covered getting started with Kaggle in Google Colab in a separate article, so if that’s what interests you, please check that out!

Importing libraries

Imports are pretty standard, with a few exceptions.

For the most part, you can import your libraries by running import like you do in any other notebook.

PyTorch is different! Before you run any other Torch imports, you’ll want to run

*** UPDATE! (01/29)*** Colab now supports native PyTorch!!! You shouldn’t need to run the code below, but I’m leaving it up just in case anyone is having any issues!

!pip install -q http://download.pytorch.org/whl/{accelerator}/torch-0.4.1-{platform}-linux_x86_64.whl torchvision
import torch

Then you can continue with your imports. If you try to simply run import torch you’ll get an error message. I really recommend clicking on the extremely helpful links that pop up. If you do, you’ll get that code right away and you can just click on “INSTALL TORCH” to import it into your notebook. The code will pop up on the left-hand side of your screen, and then you can hit “INSERT.”

Not able to simply import something else that you want with an import statement? Try a pip install! Just be aware that Google Colab wants an exclamation point before most commands.

!pip install -q keras
import keras

or:

!pip3 install torch torchvision

and:

!apt-get install

is useful too!

I did find that Pillow can be sort of buggy, but you can check which version you have by running

import PIL
print(PIL.__version__)  # PIL.PILLOW_VERSION on older Pillow releases

If you get anything below 5.3, go to the “runtime” dropdown menu, restart the runtime, and run the cell again. You should be good to go!

It’s easy to create a new notebook by dropping “File” down to “New Python 3 Notebook.” If you want to open something specific, drop the “File” menu down to “Open Notebook…”

Then you’ll see a screen that looks like this:

As you can see, you can open a recent file, files from your Google Drive, GitHub files, and you can upload a notebook right there as well.

The GitHub option is great! You can easily search by an organization or user to find files. If you don’t see what you’re looking for, try checking the repository drop-down menu!

Always be saving

Saving your work is simple! You can do a good ol’ “command-s” or drop the “File” menu down to save. You can create a copy of your notebook by dropping “File” -> “Save a Copy in Drive.” You can also download your notebook by going from “File” -> “download .ipynb” or “download .py.”

That should be enough to at least get you up and running on Colab and taking advantage of that sweet, sweet free GPU! Please let me know if you run into any other newbie problems that I might be able to help you with. I’d love to help you if I can!

If you’re just getting started with machine learning and AI, I have a few other articles you might want to check out:

Have you finished your deep learning or machine learning model, but you don’t know what to do with it next? Why not deploy it to the internet?

Get your model out there so everyone can see it!

Check out this article to learn how to deploy your machine learning model with Flask!

Photo by Sarah Cervantes on Unsplash

Thanks for reading!

The post Getting started with Google Colab appeared first on Content Simplicity.

Data cleaning and preprocessing for beginners

Anne B — Wed, 03 Apr 2019 09:25:24 +0000

Picture via Pixabay http://pixabay.com

How to successfully prepare your data for a machine learning model in minutes

(This article first appeared in Towards Data Science)

Data cleaning and preprocessing is the first (and arguably most important) step toward building a working machine learning model. It’s critical!

If your data hasn’t been cleaned and preprocessed, your model won’t work.

It’s that simple.

Data cleaning is generally thought of as the boring part. But it’s the difference between being prepared and being completely unprepared. It’s the difference between looking like a pro and looking pretty foolish.

It’s kind of like getting ready for a vacation. You might not like the preparation part, but tightening down the details in advance can save you from one nightmare of a trip.

You just have to do it or you can’t start having fun.

But how do you do it?

This tutorial walks you through the basics of preparing any dataset for any machine learning model.

Imports first!

We want to start the data cleaning process by importing the libraries that you’ll need to preprocess your data. A library is really just a tool that you can use. You give the library the input, the library does its job, and it gives you the output you need.

There are tons of libraries available, but three are essential in Python, and you’ll wind up using them pretty much every time: NumPy, Matplotlib, and Pandas. NumPy is the library you’ll need for all things mathematical. Since your code is going to run on math, you’re going to use this one. Matplotlib (specifically matplotlib.pyplot) is the library you’ll want if you’re going to make charts. Pandas is the best tool available for importing and managing datasets. Pandas and NumPy are basically essential for data preprocessing.

It makes the most sense to import these libraries with a shortcut alias so that you can save a little time later. That’s simple and you can do it like this:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

Now you can read in your dataset by typing

dataset = pd.read_csv('my_data.csv')

This tells Pandas (pd) to read in your dataset. These are the first few lines of the dataset I put together for this tutorial:

Now we have our dataset, but we need to create a matrix of independent variables (the features) and a vector of the dependent variable (the thing we want to predict). You can create the matrix of independent variables by typing:

X = dataset.iloc[:, :-1].values

That first colon (:) means that we want to grab all of the lines in our dataset. :-1 means that we want to grab all of the columns of data except the last column. The .values on the end means that we want to grab all of the values.

Now we want a vector of the dependent variable with only the data from the last column, so we can type

y = dataset.iloc[:, 3].values

Remember when you’re looking at your dataset, the index starts at 0. If you’re trying to count the columns, start counting at 0, not 1. [:, 3] gets you the column at index 3, which is the fourth (and last) column. It comes after the animal, age, and worth columns: 0 is the animal column, 1 is the age column, and 2 is the worth column. You will get used to this counting system if you aren’t already!
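
If the counting still feels slippery, here's a tiny sketch using plain Python lists in place of the dataset. The rows are made up, and I'm assuming the last column is a yes/no value. It uses the same zero-based, end-excluded slicing rules that iloc follows.

```python
# Made-up rows laid out like the tutorial's dataset:
# animal, age, worth, and a final yes/no column (hypothetical)
rows = [
    ["cat", 7, 48000, "no"],
    ["dog", 4, 65000, "yes"],
    ["moose", 17, 83000, "no"],
]

first_row = rows[0]

features = first_row[:-1]  # everything except the last item
target = first_row[3]      # the item at index 3, which is the fourth item
middle = first_row[1:3]    # indexes 1 and 2; the end index 3 is excluded

print(features)  # ['cat', 7, 48000]
print(target)    # no
print(middle)    # [7, 48000]

# The same idea applied to every row, which is what the ":" before the comma does
X_like = [row[:-1] for row in rows]
y_like = [row[3] for row in rows]
```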

What happens if we have missing data?

This actually happens all the time.

We could just remove the lines where data are missing, but that’s really not the smartest idea. That could easily cause problems. We need to find a better idea! The most common solution is to take the mean of the column to fill in the missing data point.

You can easily do this with scikit-learn’s imputer class. It lived in the preprocessing module as Imputer when this was written, and newer releases provide it as SimpleImputer in sklearn.impute. (If you don’t know about it already, scikit-learn contains amazing machine learning models and I strongly suggest you check it out!)

You might not be comfortable with terms like “method,” “class,” and “object” as they apply to machine learning. Not a problem!

A class is the model of something that we want to build. If we’re going to build a shed, the construction plan for the shed is the class.

An object is an instance of the class. The object in this example is the shed we built by following the construction plan. There can be many objects of the same class. That’s like saying that you can make lots of sheds from the construction plan.

A method is a tool that we can use on the object, or a function that’s applied to the object that takes some inputs and returns some output. This is like a handle that we can use to open the window when our shed is starting to get a little stuffy.

Photo by Roman Kraft on Unsplash

To use the imputer, we would run something like this

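# Imputer was removed from sklearn.preprocessing in scikit-learn 0.22.
# SimpleImputer always works column by column, so there is no axis argument.
from sklearn.impute import SimpleImputer
imputer = SimpleImputer(missing_values = np.nan, strategy = 'mean')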

Mean is the default strategy, so you don’t actually need to specify that, but it’s here so you can get a sense of what information you want to include. The default value for missing_values is nan. If your data set has missing values that are called “NaN,” you’ll stick with np.nan. Check out the official documentation here!

Now to fit this imputer, we type

imputer = imputer.fit(X[:, 1:3])

We only want to fit the imputer to the columns where data are missing. The first colon means that we want to include all of the lines, while 1:3 means that we’re taking column indexes 1 and 2. Don’t worry. You’ll get used to the way Python counts in no time!

Now we want to use the method that will actually replace the missing data. You’ll set that up by typing

X[:, 1:3] = imputer.transform(X[:, 1:3])
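
Curious what fit and transform are doing behind the scenes? Here's a rough sketch in plain Python, assuming a single made-up column of ages. impute_column is a hypothetical helper for illustration, not scikit-learn's implementation.

```python
import math
from statistics import mean

def impute_column(values, strategy=mean):
    """Replace NaN entries with a statistic computed from the known entries."""
    known = [v for v in values if not math.isnan(v)]
    fill = strategy(known)  # "fit": learn the fill value from the column
    return [fill if math.isnan(v) else v for v in values]  # "transform": apply it

ages = [7.0, float("nan"), 4.0, 10.0]  # made-up column with one missing value
print(impute_column(ages))  # [7.0, 7.0, 4.0, 10.0]
```

Because the strategy is just a function, you could pass statistics.median instead of mean to fill the gaps with the median.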

Try this out with other strategies! You might find that it makes more sense for your project to fill in the missing values with the median of the column. Or the mode! Decisions like these seem small, but they actually hold a lot of importance.

Just because something is popular doesn’t necessarily make it the right choice. The average (mean) of your data points isn’t necessarily the best choice for your model.

After all, nearly everyone reading this article has an above average number of arms…

Photo by Matthew Henry on Unsplash

What if you have categorical data?

Great question! You can’t exactly take the mean of cat, dog, and moose. What can we do? We can encode the categorical values as numbers! You’ll want to grab the LabelEncoder class from sklearn.preprocessing.

Start with one column where you want to encode the data and call the label encoder. Then fit it onto your data:

from sklearn.preprocessing import LabelEncoder
labelencoder_X = LabelEncoder()
X[:, 0] = labelencoder_X.fit_transform(X[:, 0])

(Remember how the numbers in the brackets work? : means that we want to work with all of the lines and 0 means that we want to grab the first column.)

That’s all it takes to replace the categorical variables in your first column with numbers. LabelEncoder numbers the categories in sorted order, so with just these three animals, instead of “cat” you’ll have 0, instead of “dog” you’ll have 1, and instead of “moose” you’ll have 2.

Do you see the potential problem?

That system of labeling implies a hierarchical value to the data that could affect your model. 2 has a higher value than 0, but moose is not (necessarily…) greater than cat.

Photo by Cel Lisboa on Unsplash

We need to create dummy variables! Dummy variables are an awesome option for data cleaning and preprocessing.

We can create one column for cat, one for moose, and so on. Then we’ll fill the columns in with 1s and 0s (think 1=yes and 0=no.) That means that if you had cat in your original column, now you’d have a 0 in the moose column, a 0 in the dog column, and a 1 in the cat column.

That sounds complicated. Enter One Hot Encoder!

Import the encoder and then specify the index of the column

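from sklearn.preprocessing import OneHotEncoder
# categorical_features was removed in scikit-learn 0.22; newer releases
# pick the column with sklearn.compose.ColumnTransformer instead
onehotencoder = OneHotEncoder(categorical_features = [0])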

Now a little fit and transform

X = onehotencoder.fit_transform(X).toarray()

Voila! Your single column has been replaced by one column for each of the categorical variables that you had in your original column and it has 1s and 0s replacing the categorical variables.

Pretty sweet, right?
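
To see what those dummy columns look like without any library at all, here's a minimal sketch in plain Python. The animal column is made up and one_hot is a hypothetical helper, so treat it as an illustration of the idea and not as what scikit-learn does internally.

```python
def one_hot(column):
    """Return the sorted category names and one row of 0s and 1s per value."""
    categories = sorted(set(column))
    encoded = [
        [1 if value == category else 0 for category in categories]
        for value in column
    ]
    return categories, encoded

animals = ["cat", "moose", "dog", "cat"]  # made-up column
categories, encoded = one_hot(animals)
print(categories)  # ['cat', 'dog', 'moose']
print(encoded)     # [[1, 0, 0], [0, 0, 1], [0, 1, 0], [1, 0, 0]]
```

Each row has exactly one 1, which is where the name "one hot" comes from.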

We can go ahead and use label encoder for our y column if we have categorical variables like “yes” and “no.”

labelencoder_y = LabelEncoder()
y = labelencoder_y.fit_transform(y)

This will go ahead and fit and transform y into an encoded variable with 1 for yes and 0 for no.

Train test split

At this point, you can go ahead and split your data into training and testing sets. I know I already said this in the image classification tutorial, but always separate your data into training and testing sets and never use your testing data for training! You need to avoid overfitting. (You can think of overfitting like memorizing super specific details before a test without understanding the information. When you memorize details, you’ll do a great job with your flashcards at home. You’ll fail any real test, though, where you’re presented with new information.)

Right now, we have a machine that needs to learn something. It needs to train on data and see how well it understands what it’s learned on separate data. Memorizing the training set is not the same thing as learning! The better your model learns on the training set, the better it will be at predicting the results for the testing set. You never want to overfit your model. You really want it to learn!

Photo by Janko Ferlič on Unsplash

First, we import

from sklearn.model_selection import train_test_split

Now we can create X_train and X_test and y_train and y_test sets.

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)

It’s very common to do an 80/20 split of your data, with 80% of your data going to training and 20% to testing. That’s why we specified a test_size of 0.2. You can split it however you need to. You don’t need to set a random state, but I like to do that so that we can exactly reproduce our results.
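
If you're wondering what a split with a fixed random state boils down to, here's a simplified sketch in plain Python. simple_train_test_split is a hypothetical helper and the data is made up. It won't produce the same rows as scikit-learn's train_test_split, but it shows the shuffle-then-slice idea and why a seed makes the result repeatable.

```python
import random

def simple_train_test_split(X, y, test_size=0.2, seed=0):
    """Shuffle the row indexes with a fixed seed, then slice off the test portion."""
    indexes = list(range(len(X)))
    random.Random(seed).shuffle(indexes)  # same seed, same shuffle, every run
    n_test = round(len(X) * test_size)
    test_idx, train_idx = indexes[:n_test], indexes[n_test:]
    X_train = [X[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_train = [y[i] for i in train_idx]
    y_test = [y[i] for i in test_idx]
    return X_train, X_test, y_train, y_test

X = [[i] for i in range(10)]  # ten made-up observations
y = [i % 2 for i in range(10)]
X_train, X_test, y_train, y_test = simple_train_test_split(X, y)
print(len(X_train), len(X_test))  # 8 2
```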

Now for feature scaling.

What is feature scaling? Why do we need it?

Well, look at our data. We have one column with animal ages from 4–17 and we have animal worth that ranges from $48,000-$83,000. Not only is the worth column made up of much higher numbers than the age column, but the values also cover a much wider range. That means that the Euclidean distance between two observations will be dominated by worth, and age will barely register.
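
Here's a quick sketch of that problem in plain Python. The three animals are made up, using the age and worth ranges mentioned above.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# (age, worth) pairs
young = (4, 48000)
old = (17, 48000)            # biggest possible age gap, same worth
young_pricier = (4, 48100)   # same age, only $100 more

print(euclidean(young, old))            # 13.0
print(euclidean(young, young_pricier))  # 100.0
```

A tiny $100 difference in worth counts for far more than the largest possible difference in age. That's the imbalance feature scaling is meant to fix.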

What if Euclidean distance doesn’t play a part in your specific machine learning model? Scaling the features can still help many models train much faster, so you might want to include this step when you’re preprocessing your data.

There are many ways to do feature scaling. They all mean that we’re putting all of our features into the same scale so that none are dominated by another.

Start with the import (you must be getting used to that)

from sklearn.preprocessing import StandardScaler

Then create a StandardScaler object that we’ll use to do the scaling

sc_X = StandardScaler()

Now we directly fit and transform our dataset. Grab the object and apply the methods.

X_train = sc_X.fit_transform(X_train)
X_test = sc_X.transform(X_test)

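We don’t need to fit it to our test set. We just need a transform.

# Only scale y if it's a continuous value (more on that in a moment)
# StandardScaler expects a 2D array, so reshape the 1D y vector first
sc_y = StandardScaler()
y_train = sc_y.fit_transform(y_train.reshape(-1, 1)).ravel()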
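
Why fit on the training set but only transform the test set? Here's a minimal sketch in plain Python of the standardization formula, z = (x - mean) / standard deviation. The ages are made up, and fit_scaler and transform are hypothetical helpers that mirror the fit and transform steps.

```python
from statistics import mean, pstdev

def fit_scaler(column):
    """Learn the mean and (population) standard deviation from the training data."""
    return mean(column), pstdev(column)

def transform(column, mu, sigma):
    """Apply z = (x - mean) / std using statistics learned elsewhere."""
    return [(x - mu) / sigma for x in column]

train_ages = [4, 8, 12, 16]  # made-up training values
test_ages = [10]             # made-up test value

mu, sigma = fit_scaler(train_ages)               # fit on the training set only
train_scaled = transform(train_ages, mu, sigma)
test_scaled = transform(test_ages, mu, sigma)    # reuse the training statistics

print(test_scaled)  # [0.0]
```

The test value lands at 0.0 because it happens to equal the training mean. The scaler never looked at the test data to decide that, which is the whole point.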

What about the dummy variables? Do you need to scale them?

Well, some people say yes and some say no. It’s a question of how much you want to hang on to your interpretation. It is good to have all of our data at the same scale. But if we scale our data, we lose our ability to easily interpret which observations belong to which variable.

What about y? If you have a dependent variable like 0 and 1, you really don’t need to apply feature scaling. It’s a classification problem with a categorical dependent variable. But if your dependent variable takes on a large range of values, as it does in a regression problem, then yes! You do want to apply the scaler!

You did it!

That’s it!

Photo by Roven Images on Unsplash

With just a handful of lines of code, you’ve taken care of the basics of data cleaning and preprocessing! You can see the code here if you want to take a look.

There will definitely be a ton of thought that you’ll need to put into this step. You want to think about exactly how you’re going to fill in your missing data. Consider whether you need to scale your features and how you want to do it. Dummy variables or no? Are you going to encode your data? Will you encode your dummy variables? There are a ton of details to consider here. Nobody said data cleaning would be easy!

That said, you’ve got this!

Now get out there and get that data ready!

Are you curious about deep learning? You might want to take a look at Intro to Deep Learning!

Need some free GPU, but not sure where to find it? Check out Getting Started with Google Colab.

Have you already finished a machine learning model, but you don’t know what to do with it next? Why not deploy it to the internet?

Check out this article to learn how to deploy your machine learning model with Flask!

As always, if you’re doing anything cool with this information, let people know about it in the responses below or reach out any time on Twitter @annebonnerdata!

The post Data cleaning and preprocessing for beginners appeared first on Content Simplicity.

Simply deep learning: an effortless introduction

Anne B — Mon, 04 Mar 2019 04:13:44 +0000

This article first appeared on Towards Data Science.

What is an artificial neural network, how does it work, and what does it have to do with deep learning?

Let’s start with a quick recap from part 1 for anyone who hasn’t looked at it:

What is deep learning?

It’s learning from examples. That’s pretty much the deal.

The inspiration for deep learning is the way that the human brain filters information. Its purpose is to mimic how the human brain works to create some real magic.

Deep learning attempts to mimic the activity in layers of neurons in the neocortex.

It’s very literally an artificial neural network.

In the human brain, there are about 100 billion neurons. Each neuron connects to about 100,000 of its neighbors. That is what we’re trying to create, but in a way and at a level that works for machines.

Image by geralt on Pixabay

What does this mean in terms of neurons, axons, dendrites, and so on? Well, the neuron has a body, dendrites, and an axon. The signal from one neuron travels down the axon and transfers to the dendrites of the next neuron. That connection where the signal passes is called a synapse.

Image by mohamed_hassan on Pixabay

So the neuron (or node) gets a signal or signals (input values), which pass through the neuron. That neuron delivers the output signal. Think of the input layer as your senses: the things you, for example, see, smell, and feel. These are independent variables for one single observation. This information is broken down into numbers and the bits of binary data that a computer can use. (You will need to either standardize or normalize these variables so that they’re within the same range.)

How do artificial neural networks learn?

There are two different approaches to get a program to do what you want. First, there’s the specifically guided and hard-programmed approach. In this approach, you tell the program exactly what you want it to do. Then there are neural networks. In neural networks, you tell your network the inputs and what you want for the outputs, and let it learn on its own. By allowing the network to learn on its own, we can avoid the necessity of entering in all the rules. For a neural network, you can create the architecture and then let it go and learn. Once it’s trained up, you can give it a new image and it will be able to distinguish output.

Photo by Annie Spratt on Unsplash

There are different kinds of neural networks. They’re generally classified into feedforward and feedback networks.

The majority of modern deep learning architectures are based on artificial neural networks (ANNs). They use many layers of nonlinear processing units for feature extraction and transformation. Each successive layer uses the output of the previous layer for its input. What they learn forms a hierarchy of concepts. In this hierarchy, each level learns to transform its input data into a more and more abstract and composite representation.

Image by ahmedgad on Pixabay

What happens inside the neuron? The input node takes in information in a numerical form. The information is presented as an activation value where each node is given a number. The higher the number, the greater the activation.

Based on the connection strength (weights) and transfer function, the activation value passes to the next node. Each of the nodes sums the activation values that it receives (it calculates the weighted sum) and modifies that sum based on its transfer function. Next, it applies an activation function. An activation function is a function that’s applied to this particular neuron. From that, the neuron understands if it needs to pass along a signal or not. The activation runs through the network until it reaches the output nodes. The output nodes then give us the information in a way that we can understand.

Your network will use a cost function to compare the output and the actual expected output. The model performance is evaluated by the cost function. It’s expressed as the difference between the actual value and the predicted value. There are many different cost functions you can use. Whichever one you choose, you’re looking at how much error you have in your network, and you’re working to minimize the cost function (also called the loss function). In essence, the lower the cost, the closer the output is to your desired output. The information goes back, and the neural network begins to learn with the goal of minimizing the cost function by tweaking the weights. This process is called backpropagation.

Interested in learning more about cost functions? Check out A List of Cost Functions Used in Neural Networks, Alongside Applications on Stack Exchange

In forward propagation, information is entered into the input layer and propagates forward through the network to get our output values. We compare the values to our expected results. Next, we calculate the errors and propagate the info backward. This allows us to train the network and update the weights. Because of the way the backpropagation algorithm is structured, you’re able to adjust all of the weights simultaneously. This allows you to see which part of the error each of your weights in the neural network is responsible for.

Hungry for more? You might want to read Efficient BackProp by Yann LeCun, et al., as well as Neural Networks and Deep Learning by Michael Nielsen.

When you’ve adjusted the weights to the optimal level, you’re ready to proceed to the testing phase!

What is a weighted sum?
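
Here's a minimal sketch of the weighted sum described earlier: each incoming activation value is multiplied by the weight of its connection, and the node adds up the results. The inputs and weights are made up for illustration.

```python
def weighted_sum(inputs, weights):
    """Multiply each input by its connection weight and add everything up."""
    return sum(x * w for x, w in zip(inputs, weights))

inputs = [1.0, 2.0, 3.0]    # made-up activation values arriving at the node
weights = [0.5, -1.0, 2.0]  # made-up connection strengths

z = weighted_sum(inputs, weights)
print(z)  # 0.5 - 2.0 + 6.0 = 4.5
```

The result, z, is the number that gets handed to the activation function.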

What’s an activation function?

In a nutshell, the activation function of a node defines the output of that node.

The activation function (or transfer function) translates the input signals to output signals. It maps the output values on a range like 0 to 1 or -1 to 1. It’s an abstraction that represents the rate of action potential firing in the cell. It’s a number that represents the likelihood that the cell will fire. At its simplest, the function is binary: yes (the neuron fires) or no (the neuron doesn’t fire). The output can be either 0 or 1 (on/off or yes/no), or it can be anywhere in a range. If you were using a function that maps a range between 0 and 1 to determine the likelihood that an image is a cat, for example, an output of 0.9 would show a 90% probability that your image is, in fact, a cat.

Photo by minanafotos on Pixabay

What options do we have? There are many activation functions, but these are four very common ones:

Threshold function: This is a step function. If the summed value of the input is below a certain threshold (zero in this example), the function passes on 0. If it’s equal to or more than zero, then it passes on 1. It’s a very rigid, straightforward, yes or no function.

Example threshold function

Sigmoid function: This function is used in logistic regression. Unlike the threshold function, it’s a smooth, gradual progression from 0 to 1. It’s very useful in the output layer and is heavily used when you want to predict a probability. (Logistic regression is one of the most well-known algorithms in statistics and machine learning.)

Example sigmoid function

Hyperbolic tangent function: This function is very similar to the sigmoid function. Unlike the sigmoid function, which goes from 0 to 1, the value goes below zero, from -1 to 1. Although this isn’t what happens in biology, this function gives better results when it comes to training neural networks. Neural networks sometimes get “stuck” during training with the sigmoid function. This happens when there’s a lot of strongly negative input that keeps the output near zero, which messes with the learning process.

Example hyperbolic tangent function (tanh)

Rectifier function: This might be the most popular activation function in the universe of neural networks. It’s the most efficient and biologically plausible. Even though it has a kink, it’s smooth and gradual after the kink at 0. This means, for example, that your output would be either “no” or a percentage of “yes.” This function doesn’t require normalization or other complicated calculations.

Example rectifier function
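If it helps to see the four functions side by side, here’s a minimal sketch in plain Python. The function names and the default cutoff of 0 for the threshold function are my own choices for illustration; they don’t come from any particular library.

```python
import math

def threshold(x, cutoff=0.0):
    """Step function: 1 if the input reaches the cutoff, otherwise 0."""
    return 1 if x >= cutoff else 0

def sigmoid(x):
    """Smooth curve that runs from 0 to 1."""
    return 1.0 / (1.0 + math.exp(-x))

def hyperbolic_tangent(x):
    """Smooth curve that runs from -1 to 1."""
    return math.tanh(x)

def rectifier(x):
    """0 for negative input, the input itself otherwise."""
    return max(0.0, x)

# Compare the four outputs for a negative, a zero, and a positive input.
for x in (-2.0, 0.0, 2.0):
    print(x, threshold(x), round(sigmoid(x), 3),
          round(hyperbolic_tangent(x), 3), rectifier(x))
```

Notice that only the hyperbolic tangent ever returns a negative number, and only the rectifier can return something bigger than 1.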

Want to dive deeper? Check out Deep Sparse Rectifier Neural Networks by Xavier Glorot, et al.

So let’s say, for example, your desired value is binary. You’re looking for a “yes” or a “no.” Which activation function do you want to use? From the above examples, you could use the threshold function, or you could go with the sigmoid activation function. The sigmoid function would be able to give you the probability of a yes.
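Here’s a rough sketch of that yes-or-no decision with a single neuron. The input values and weights are made up purely for illustration, and the 0.5 cutoff for calling something a “yes” is my own assumption.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def neuron_output(inputs, weights):
    """Weighted sum of the inputs, passed through the sigmoid function."""
    total = sum(i * w for i, w in zip(inputs, weights))
    return sigmoid(total)

# Hypothetical input values and weights, chosen only for illustration.
inputs = [0.8, 0.2, 0.5]
weights = [2.0, -1.0, 1.5]

probability = neuron_output(inputs, weights)
answer = "yes" if probability >= 0.5 else "no"
print(f"probability of yes: {probability:.2f} -> {answer}")
```

The sigmoid gives you the probability, and the cutoff turns that probability into a hard yes or no, which is what the threshold function would have handed you directly.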

Photo by rawpixel on Unsplash

So, how are the weights adjusted, exactly?

You could use a brute force approach to adjust the weights and test thousands of different combinations. Even with the simplest neural network that has only five input values and a single hidden layer, you’ll wind up with 10⁷⁵ possible combinations. Running this on the world’s fastest supercomputer would take longer than the universe has existed so far.
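To get a feel for a number that size, here’s a back-of-the-envelope sketch. Every figure in it is an assumption I picked so the arithmetic lands on 10⁷⁵: 25 weights, 1,000 candidate values for each weight, and a hypothetical machine that can test 10¹⁷ combinations every second. None of these are measurements.

```python
weights = 25                 # assumed number of weights in the network
values_per_weight = 1000     # assumed number of candidate values per weight
combinations = values_per_weight ** weights

tests_per_second = 10 ** 17  # hypothetical machine speed, not a real benchmark
seconds_needed = combinations // tests_per_second

seconds_per_year = 60 * 60 * 24 * 365
years_needed = seconds_needed // seconds_per_year
age_of_universe_years = 13_800_000_000   # roughly 13.8 billion years

# Count the digits to report each number as a power of ten.
print(f"combinations: 10^{len(str(combinations)) - 1}")
print(f"years needed: roughly 10^{len(str(years_needed)) - 1}")
print("longer than the age of the universe:", years_needed > age_of_universe_years)
```

Even if you make the machine a billion times faster, the answer doesn’t change in any way that matters.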

Photo by skorchanov on Pixabay

However, if you go with gradient descent, you can look at the slope of the cost function at your current weights and find out whether it’s positive or negative. That tells you which direction is downhill, so you can keep adjusting the weights in that direction on your quest to reach the global minimum.

Gradient descent is an algorithm for finding the minimum of a function. The analogy you’ll see over and over is that of someone stuck on top of a mountain and trying to get down (find the minimum). There’s heavy fog making it impossible to see the path, so she uses gradient descent to get down to the bottom of the mountain. She looks at the steepness of the hill where she is and proceeds down in the direction of the steepest descent. You should assume that the steepness isn’t immediately obvious. Luckily she has a tool that can measure steepness. Unfortunately, this tool takes forever. She wants to use it as infrequently as she can to get down the mountain before dark. The real difficulty is choosing how often she wants to use her tool so she doesn’t go off track.

In this analogy, the person is the algorithm. The steepness of the hill is the slope of the error surface at that point. The direction she goes is set by the gradient of the error surface at that point (she heads the opposite way, downhill). The tool she’s using is differentiation (the slope of the error surface can be calculated by taking the derivative of the squared error function at that point). The distance she travels before taking another measurement is the learning rate of the algorithm.

It’s not a perfect analogy, but it gives you a good sense of what gradient descent is all about. The machine is learning the gradient, or direction, that the model should take to reduce errors.
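Here’s a minimal sketch of that walk down the mountain with a single weight. The error curve, the starting point, the learning rate, and the number of steps are all assumptions I made up so the example stays tiny.

```python
def error(w):
    """A simple bowl-shaped error curve with its lowest point at w = 3."""
    return (w - 3.0) ** 2

def slope(w):
    """Derivative of the error curve: how steep the hill is at w."""
    return 2.0 * (w - 3.0)

def gradient_descent(start, learning_rate=0.1, steps=100):
    w = start
    for _ in range(steps):
        # Measure the slope, then take a step in the downhill direction.
        w = w - learning_rate * slope(w)
    return w

best_w = gradient_descent(start=10.0)
print(round(best_w, 4), round(error(best_w), 8))
```

A bigger learning rate means fewer measurements but a higher chance of overshooting the bottom; a smaller one is safer but slower.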

Stochastic Gradient Descent

Gradient descent is only guaranteed to find the global minimum when the cost function is convex, but what if it isn’t?

Stochastic gradient descent has much higher fluctuations, which can help it escape local minima and find the global minimum. It’s called “stochastic” because the samples are shuffled and fed in one at a time in random order, instead of as a single group or in the order they appear in the training set. It looks like it might be slower, but it’s often faster because it doesn’t have to load all the data into memory and wait while the data is all run together. The main pro for batch gradient descent is that it’s a deterministic algorithm. This means that if you have the same starting weights, every time you run the network you will get the same results. Stochastic gradient descent picks its samples at random, so the results can differ from run to run. (You can also run mini-batch gradient descent where you set a number of rows, run that many rows at a time, and then update your weights.)
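To make the difference concrete, here’s a small sketch that fits a single weight three ways: batch, stochastic, and mini-batch. The data is made up (it roughly follows y = 2x with a little noise), and the learning rate, epoch count, and mini-batch size of 2 are my own assumptions.

```python
import random

# Made-up data that roughly follows y = 2 * x, with a little noise added.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
noise = [0.1, -0.1, 0.05, -0.05, 0.1, -0.1]
data = [(x, 2.0 * x + n) for x, n in zip(xs, noise)]

def gradient(w, rows):
    """Average slope of the squared error for weight w over the given rows."""
    return sum(2.0 * (w * x - y) * x for x, y in rows) / len(rows)

def train(batch_size, shuffle, epochs=200, learning_rate=0.05):
    w = 0.0
    for _ in range(epochs):
        rows = data[:]
        if shuffle:
            random.shuffle(rows)
        # Update the weight once per batch of rows.
        for start in range(0, len(rows), batch_size):
            w -= learning_rate * gradient(w, rows[start:start + batch_size])
    return w

batch_w = train(batch_size=len(data), shuffle=False)   # all rows per update
stochastic_w = train(batch_size=1, shuffle=True)       # one row per update
mini_batch_w = train(batch_size=2, shuffle=True)       # two rows per update

print("batch:     ", round(batch_w, 4))
print("stochastic:", round(stochastic_w, 4))
print("mini-batch:", round(mini_batch_w, 4))
```

All three land close to 2. Run it a few times and the batch result stays put, while the stochastic and mini-batch results wobble a little because the rows arrive in a different order each time.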

Loving this? You might want to take a look at A Neural Network in 13 lines of Python-Part 2 Gradient Descent by Andrew Trask and Neural Networks and Deep Learning by Michael Nielsen.

So here’s a quick walkthrough of training an artificial neural network with stochastic gradient descent:

1: Randomly initialize the weights to small numbers close to 0.
2: Input the first observation of your dataset into the input layer, with each feature in one input node.
3: Forward propagation — from left to right, the neurons are activated in a way that each neuron’s activation is limited by the weights. You propagate the activations until you get the predicted result.
4: Compare the predicted result to the actual result and measure the generated error.
5: Backpropagation — from right to left, the error is back propagated. The weights are updated according to how much they are responsible for the error. (The learning rate decides how much we update the weights.)
6: Online learning (repeat steps 2–5 and update the weights after each observation, which is what stochastic gradient descent does) OR batch learning (repeat steps 2–5, but update the weights only after a batch of observations).
7: When the whole training set has passed through the ANN, that is one epoch. Repeat with more epochs.
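The steps above can be sketched in plain Python. This is a toy under loud assumptions, not a production network: the dataset (logical OR), the two hidden neurons, the sigmoid activations, the learning rate, the epoch count, and the random seed are all choices I made for illustration, and I left out bias terms to keep it short.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Toy dataset (logical OR): two features per observation and one target.
dataset = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
           ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]

rng = random.Random(42)
n_inputs, n_hidden = 2, 2
learning_rate = 0.5

# Step 1: randomly initialize the weights to small numbers close to 0.
hidden_w = [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]
            for _ in range(n_hidden)]
output_w = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]

def forward(features):
    """Step 3: propagate the activations from the inputs to the output."""
    hidden = [sigmoid(sum(w * f for w, f in zip(ws, features)))
              for ws in hidden_w]
    predicted = sigmoid(sum(w * h for w, h in zip(output_w, hidden)))
    return hidden, predicted

def mean_squared_error():
    return sum((forward(f)[1] - t) ** 2 for f, t in dataset) / len(dataset)

error_before = mean_squared_error()

for epoch in range(2000):                      # Step 7: one pass is one epoch
    rows = dataset[:]
    rng.shuffle(rows)                          # random order: the stochastic part
    for features, target in rows:              # Step 2: one observation at a time
        hidden, predicted = forward(features)  # Step 3: forward propagation
        error = predicted - target             # Step 4: measure the error
        # Step 5: backpropagation. Each weight changes in proportion to
        # its share of the error, scaled by the learning rate.
        output_delta = error * predicted * (1.0 - predicted)
        for j in range(n_hidden):
            hidden_delta = (output_delta * output_w[j]
                            * hidden[j] * (1.0 - hidden[j]))
            output_w[j] -= learning_rate * output_delta * hidden[j]
            for i in range(n_inputs):
                hidden_w[j][i] -= learning_rate * hidden_delta * features[i]
        # Step 6: the weights were just updated after a single observation.

error_after = mean_squared_error()
print(f"error before training: {error_before:.4f}")
print(f"error after training:  {error_after:.4f}")
```

The only thing to watch for is that the error after training comes out lower than the error before. To try batch learning instead, you would add up the weight changes across a batch of observations and apply them once at the end of the batch.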

There you have it! Those are the basic ideas behind what’s happening in an artificial neural network.

Photo by Sam Mathews on Unsplash

Still with me? Come on over to part 3!

(If anyone out there has any specific topics they want me to cover, leave a comment in the responses below and I’ll tackle them if I can!)

The post Simply deep learning: an effortless introduction appeared first on Content Simplicity.

How to Set up Kaggle in Google Colab

Anne B — Sun, 03 Mar 2019 22:26:55 +0000

You know where all those datasets are and you know where you want them to go, but how do you easily move your datasets from Kaggle into Google Colab without a lot of complicated madness?

Let me show you!

Discovering the joy that is Google Colab was definitely one of the smartest things I’ve done since getting started with deep learning, machine learning, and AI. Google Colab provides free GPU (for real!) to pretty much anyone who wants it. If you’re just getting started, you need to get on Colab! I wrote another article that covers getting set up in Colab for the first time, but getting Kaggle up and running in Colab really deserves its own article.

Photo by Oscar Söderlund on Unsplash

Although Colab is extremely user-friendly, there are a few details that you might want help with while getting yourself set up.

Kaggle, it turns out, is one of those details.

Kaggle needs a little finesse. A little love. However, if you’re after those sweet, sweet datasets, you want to get this working! It’s actually really simple; there are just a few easy steps you need to take. If you just want to view the code on GitHub and move on with your day (things can get a little…verbose…around here), you are welcome to do so!

Here’s the simplest way I’ve found to access the Kaggle data for the first time:

Getting Started

(One quick note: in order to be able to access the Kaggle data, you’ll need to be signed up with Kaggle (free!) and agree to the terms and conditions of the competition that you want to participate in.)

First, grab your token from Kaggle.

Go to your account page (the drop-down menu in the top right corner of the screen will take you there).

Then scroll down to API and hit “Create New API Token.”

That’s going to download a file called kaggle.json. Make sure you know where this file is! Maybe put it somewhere you can find it…

Just a suggestion.

Open the file and you’ll see something that looks a lot like this:

{"username":"YOUR-USER-NAME","key":"SOME-VERY-LONG-STRING"}

Have that thing handy for a future copy-and-paste!

Next, go to Colab and start a new notebook. I’m a big fan of getting up and running on GPU right away, and to do that, go to the “runtime” drop-down menu, select “change runtime type” and then select GPU in the “Hardware accelerator” drop-down menu. Then hit SAVE.

Next, you’ll want to install Kaggle. It’s almost exactly like installing it in your Jupyter Notebooks, but Colab wants an exclamation point at the beginning of your code. Just run:

!pip install kaggle

You can use !ls -a to check if you already have a folder called .kaggle (the -a matters because plain !ls hides folders that start with a dot), or just run

!mkdir .kaggle

to create one.

Next, you’ll want to run the cell below, but please pay attention to a couple of things:

there’s no exclamation point on this one
you definitely want to change the username and key to the ones from your downloaded Kaggle file (that copy-and-paste from earlier)!

import json

token = {"username":"YOUR-USER-NAME","key":"SOME-VERY-LONG-STRING"}

with open('/content/.kaggle/kaggle.json', 'w') as file:
    json.dump(token, file)

I did a copy-and-paste when I ran this code and actually had a little trouble. I had to delete and re-type the single apostrophes in the code above to get that cell to run properly. A likely culprit is curly “smart” quotes sneaking in during the copy-and-paste; Python only accepts straight ones. If you’re popping an error code for no discernible reason, give that a try!

Next, run

!cp /content/.kaggle/kaggle.json ~/.kaggle/kaggle.json

and then

!kaggle config set -n path -v /content

You’ll get a warning that your Kaggle API key is readable by other users on the system.

You can easily fix that by running:

!chmod 600 /root/.kaggle/kaggle.json
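If you’d rather do the write-and-lock-down part in a single Python cell, here’s one possible sketch. The helper name save_kaggle_token and its folder argument are hypothetical (I made them up for this example), and it assumes a Linux environment like Colab where file permissions work the usual way.

```python
import json
import os
from pathlib import Path

def save_kaggle_token(token, folder=None):
    """Write the token to <folder>/kaggle.json so only its owner can read it.

    Hypothetical helper: if no folder is given, it uses ~/.kaggle.
    """
    folder = Path(folder) if folder else Path.home() / ".kaggle"
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / "kaggle.json"
    path.write_text(json.dumps(token))
    os.chmod(path, 0o600)   # same effect as running chmod 600 on the file
    return path

# Placeholder values: paste in the ones from your own kaggle.json.
token = {"username": "YOUR-USER-NAME", "key": "SOME-VERY-LONG-STRING"}
```

Calling save_kaggle_token(token) would then create ~/.kaggle/kaggle.json with the restricted permissions already in place.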

After that, you should be able to run

!kaggle datasets list

to access a list of Kaggle datasets.

If you’re looking for a specific dataset, you can run something like

!kaggle datasets list -s sentiment

in order to list, for example, datasets that include “sentiment” in their titles.

Now it’s time to start having real fun!

Downloading the Data

Go to Kaggle, find the dataset you want, and on that page, click the API button (it will copy the code automatically).

You’ll paste that code into your next cell, but make sure you add that exclamation point to the beginning of the cell and add -p /content to clarify your path.

!kaggle datasets download -d kazanova/sentiment140 -p /content

To unzip your files, run

!unzip *.zip
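If you’d like to stay in Python for this step, the standard library’s zipfile module can do the same job. This is just an alternative sketch; the helper name unzip_all is my own.

```python
import zipfile
from pathlib import Path

def unzip_all(folder="."):
    """Extract every .zip file in the folder and return the extracted names."""
    extracted = []
    for archive in sorted(Path(folder).glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(folder)
            extracted.extend(zf.namelist())
    return extracted
```

Running unzip_all("/content") would unpack whatever you downloaded there and hand back the list of filenames, which is handy for the next step.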

Welcome to Data Town!!! Want to take a look? Try running:

import pandas as pd

# This particular file has no header row and isn't UTF-8 encoded
d = pd.read_csv('training.1600000.processed.noemoticon.csv', encoding='latin-1', header=None)

d.head()

(Substitute a filename from your dataset for the filename above, of course.)

Now get out there and create something amazing!

Photo by Fidel Fernando on Unsplash

If anyone out there does something seriously awesome with their newly-gotten data, I want to hear about it! Please let everyone know what you’ve created in the responses below.

The post How to Set up Kaggle in Google Colab appeared first on Content Simplicity.