
Breaking the Jargons - Issue #5

Parul Pandey
Vector Norms, new emotion dataset, matplotlib’s interactive backend and more…

Hi there!
Welcome to the newsletter’s 5th edition. As always, there are some hands-on tutorials, reviews, and some interesting concepts and resources. There isn’t any interview this month, but other than that, the composition of the newsletter remains pretty similar to its previous editions.
📜 Articles
Vector norms play an important role in machine learning, especially when computing similarities between items. This article breaks down the idea of the L¹, L², L∞, and Lᵖ norms in simple terms.
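For a quick feel of these norms, here is a minimal sketch using NumPy. It is not taken from the article; the sample vector and the helper name lp_norm are assumptions made purely for illustration.

```python
import numpy as np

# Sample vector chosen for illustration only.
v = np.array([3.0, -4.0])

# L1 norm: sum of absolute values.
l1 = np.linalg.norm(v, ord=1)

# L2 norm: Euclidean length (the default for 1-D input).
l2 = np.linalg.norm(v)

# L-infinity norm: largest absolute component.
linf = np.linalg.norm(v, ord=np.inf)


def lp_norm(x, p):
    """Hypothetical helper: general Lp norm, (sum |x_i|^p)^(1/p), for p >= 1."""
    return float((np.abs(x) ** p).sum() ** (1.0 / p))


print(l1, l2, linf, lp_norm(v, 3))
```

As p grows, lp_norm(v, p) moves toward the L∞ value, which is one way to see why the max norm is written as L∞.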
Emotion recognition is a common classification task. However, most of the datasets available for this purpose cover only polarity labels: positive, negative, and at times neutral. This article explores a new emotion dataset that consists of eight basic emotions: anger, anticipation, disgust, fear, joy, sadness, surprise, and trust.
A few months back, Google open-sourced TensorFlow Decision Forests, aka the TF-DF library. TF-DF provides a unified API for both tree-based models and neural networks. It is helpful for training, serving, inference, and interpretation of Decision Forest models.
Data Visualization
Matplotlib caters to different users and hence supports various backends. This article presents two matplotlib backends that render interactive images in notebooks: the nbagg backend and the ipympl backend.
Interactive matplotlib plot with the ipympl backend
💡 Concept corner
Whenever a numeric column in a CSV file contains a thousands separator, pandas.read_csv() reads it as a string rather than an integer. To avoid this, tell pandas.read_csv() explicitly that the comma is a thousands separator by passing the thousands parameter.
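Here is a small sketch of the difference. The CSV content, column names, and variable names are made up for illustration; only read_csv and its thousands parameter come from pandas itself.

```python
import io

import pandas as pd

# Made-up CSV data: the population values contain commas, so they are quoted.
csv_text = 'city,population\nSpringfield,"1,234,567"\nShelbyville,"45,678"\n'

# Default behaviour: the column comes back as text, not numbers.
default_df = pd.read_csv(io.StringIO(csv_text))

# With thousands=",", pandas strips the separators and parses integers.
parsed_df = pd.read_csv(io.StringIO(csv_text), thousands=",")

print(default_df["population"].dtype)
print(parsed_df["population"].dtype)
print(parsed_df["population"].sum())
```

The first dtype printed is a string-like type, while the second is an integer type, so arithmetic such as sum() works only on the second frame.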
🎁 Resource of the Month
The CS 329P: Practical Machine Learning course with Qingqing Huang, Mu Li, and Alex Smola is currently live and has been made free for everyone. Some of the topics that will be covered in the course are AutoML, Distillation, Distributed training, Model Serving, Distribution Tests, Fairness, and much more. Join them if you are interested in the practical aspects of applied machine learning.
That is all for this edition. See you with another roundup next month. You can subscribe to receive the newsletter directly in your mailbox every month or share it with someone who could find it helpful.
Until next month,
Parul Pandey
Parul Pandey @pandeyparul

Breaking down data science jargon, one article at a time.
