To be added
Set of notebooks associated with the chapter.
-
Create a wordcloud: How to create a word cloud, which is often used to get a quick sense of the text corpus at hand.
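As a rough illustration of the idea, a word cloud is driven by word frequencies. The sketch below counts words with the standard library and, optionally, hands the counts to the third-party `wordcloud` package. The stopword list, the function names, and the output path are assumptions made for this example and are not taken from the notebook.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"the", "a", "an", "is", "of", "and", "to", "in"}


def word_frequencies(text):
    """Count lowercase word occurrences, skipping stopwords."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def save_wordcloud(frequencies, path="wordcloud.png"):
    """Render the counts as an image. Requires `pip install wordcloud`."""
    from wordcloud import WordCloud  # lazy import: third-party dependency

    cloud = WordCloud(width=800, height=400, background_color="white")
    cloud.generate_from_frequencies(frequencies)
    cloud.to_file(path)


freqs = word_frequencies("The cat and the hat. The cat sat in the hat!")
print(freqs.most_common(3))
```

Calling `save_wordcloud(freqs)` would write the image to disk; larger words correspond to higher counts.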
-
Effect of different tokenizers on Social Media Text Data: Here we show how different tokenizers can produce different output for the same input text. With text from social platforms, this choice can significantly affect performance on the downstream task. We work with five tokenizers:
* word_tokenize from NLTK
* TweetTokenizer from NLTK
* Twikenizer
* Twokenizer by ARK@CMU
* twokenize
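To make the difference concrete, here is a minimal sketch using two hypothetical regex tokenizers. They are stand-ins written for this example, not the implementations of any tokenizer listed above: one splits generically on words and punctuation, the other keeps mentions, hashtags, emoticons, and URLs intact.

```python
import re

# Generic tokenizer: runs of word characters, or single punctuation marks.
GENERIC = re.compile(r"\w+|[^\w\s]")

# Tweet-aware tokenizer (illustrative pattern): URLs, @mentions/#hashtags,
# simple emoticons, then the generic fallbacks.
TWEET_AWARE = re.compile(
    r"https?://\S+|[@#]\w+|[:;=][-']?[)(DPp]|\w+|[^\w\s]"
)


def naive_tokenize(text):
    return GENERIC.findall(text)


def tweet_aware_tokenize(text):
    return TWEET_AWARE.findall(text)


tweet = "@nlp_fan loving #NLProc :) http://t.co/x"
print(naive_tokenize(tweet))
print(tweet_aware_tokenize(tweet))
```

The generic tokenizer breaks the mention, hashtag, emoticon, and URL into many fragments, while the tweet-aware one returns five tokens. Real tokenizers handle far more cases, but the contrast is the same in kind.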
-
Trending topics: Find trending topics on Twitter using tweepy
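A hedged sketch of what this can look like with tweepy 4.x follows. It assumes you have valid API credentials and an access tier that still exposes the trends endpoint; the helper names and the ranking by `tweet_volume` are choices made for this example, not necessarily what the notebook does.

```python
def top_trends(response, n=5):
    """Pick the n trend names with the highest reported tweet volume.

    `response` is assumed to have the shape returned by the trends
    endpoint: a one-element list holding a dict with a "trends" list.
    """
    trends = response[0]["trends"]
    # tweet_volume may be None when Twitter does not report it.
    ranked = sorted(trends, key=lambda t: t.get("tweet_volume") or 0, reverse=True)
    return [t["name"] for t in ranked[:n]]


def fetch_trends(consumer_key, consumer_secret, access_token,
                 access_token_secret, woeid=1):
    """Fetch trends for a location (WOEID 1 is worldwide). Needs tweepy >= 4."""
    import tweepy  # lazy import: third-party dependency

    auth = tweepy.OAuth1UserHandler(
        consumer_key, consumer_secret, access_token, access_token_secret
    )
    api = tweepy.API(auth)
    return api.get_place_trends(woeid)


# Offline example with a hand-made response of the assumed shape.
sample = [{"trends": [
    {"name": "#a", "tweet_volume": 10},
    {"name": "#b", "tweet_volume": None},
    {"name": "#c", "tweet_volume": 50},
]}]
print(top_trends(sample, n=2))
```

With real credentials, `top_trends(fetch_trends(...))` would return the current top trend names.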
-
Sentiment Analysis: Basic sentiment analysis using TextBlob
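For orientation, TextBlob exposes a polarity score in the range [-1, 1] through `TextBlob(text).sentiment.polarity`. The sketch below wraps that call and maps the score to a label; the 0.1 threshold and the function names are assumptions chosen for illustration.

```python
def label_polarity(polarity, threshold=0.1):
    """Map a polarity score in [-1, 1] to a coarse label.

    The threshold is an arbitrary choice for this sketch.
    """
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


def textblob_sentiment(text):
    """Label a text using TextBlob. Requires `pip install textblob`."""
    from textblob import TextBlob  # lazy import: third-party dependency

    return label_polarity(TextBlob(text).sentiment.polarity)


print(label_polarity(0.8), label_polarity(-0.5), label_polarity(0.0))
```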
-
Preprocessing Social Media Text Data: Common functions involved in the pre-processing pipeline for Social Media Text Data.
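A minimal sketch of such a pipeline, using only the standard library, might look like the following. The specific steps and their order are assumptions for illustration; the notebook may include others.

```python
import html
import re


def preprocess_tweet(text):
    """Apply a few common clean-up steps to a single tweet."""
    text = html.unescape(text)                   # "&amp;" -> "&"
    text = re.sub(r"https?://\S+", "", text)     # drop URLs
    text = re.sub(r"@\w+", "", text)             # drop @mentions
    text = re.sub(r"#(\w+)", r"\1", text)        # keep the hashtag word, drop "#"
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)   # cap repeated characters at two
    text = text.lower()
    return re.sub(r"\s+", " ", text).strip()     # collapse whitespace


cleaned = preprocess_tweet("@bob Sooooo good!!! #NLProc &amp; fun http://t.co/x")
print(cleaned)
```

Whether to drop or keep mentions, hashtags, and elongated words depends on the task; for sentiment, for example, elongation can itself be a signal.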
-
Text representation of Social Media Text Data: How to use embeddings to represent Social Media Text Data
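One simple way to turn a tokenized post into a fixed-length vector is to average the embeddings of its tokens and skip out-of-vocabulary ones. The sketch below does this with a toy two-dimensional lookup table; the vectors and function name are made up for the example, and in practice the table would come from a pretrained embedding model.

```python
def average_embedding(tokens, vectors, dim):
    """Average the vectors of known tokens; return zeros if none are known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return [0.0] * dim
    # zip(*known) groups the values of each dimension together.
    return [sum(column) / len(known) for column in zip(*known)]


# Toy embedding table (hypothetical values, for illustration only).
toy_vectors = {"good": [1.0, 0.0], "movie": [0.0, 1.0]}

vec = average_embedding(["good", "movie", "lol"], toy_vectors, dim=2)
print(vec)
```

Social media text tends to contain many tokens missing from general-purpose vocabularies, so how out-of-vocabulary tokens are handled matters more here than on edited text.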
-
Sentiment Analysis: Here we use the preprocessing and representation steps covered earlier to build a better classifier.
Color figures as requested by the readers.