Article Sentiment Through AI

This is part 1 on evaluating sentiments using ML/AI of news articles.

This post builds on work from last week as I explore news articles with ML/AI. To recap, I aggregated the top news from CNBC and NYTimes and calculated their overall sentiment score. However, since all the news articles are combined together, there is no way to evaluate them individually.

In this post, I will examine the individual article’s sentiment.

Methodology

Last week I use AWS Comprehend; however, this week I will using Google Cloud Natural Language.

Why the change?

Because of AWS’s limitations. According to their guidelines and limits, the maximum size for sentiment detection is 5KB. That is a mere 2,500 words!If an article goes over 2500 words, I have to split them, and I have to analysis separately. Then, I need to weigh them appropriately and did a final calculation. I am lazy so I seek a better solution. I found it with Google Cloud Natural Language

Note to businesses: This is a reason why customers switch to a different service.

Google Cloud Natural Language

Google Cloud Natural Language derive insights from unstructured text using Google maching learning

Google’s sentiment analysis is less specific than AWS. Google provides two values: magnitude and score. A score of 0 is neutral. A score of less than 0 is considered to have negative emotion and a score that is greater than 0 is considered to have positive emotion. The magnitude indicates the level of emotional content. Pretty vague but let’s take a look at some samples. You can read more about Google’s sentiment analysis here.

Results

Enough theory! Let’s analyze some examples of the output and see if they make sense.

Here is the results for the average magnitude and score for CNBC and NYTimes:

source avg(gcloud_magnitude) avg(gcloud_score)
cnbc 4.656716405678151260447761194030 -0.082835822463480393880597014925
nytimes 14.884188043510812147863247863248 -0.060256411440861528376068376068

NYTimes is more emotional based on its average gcloud_magnitude score, 4.65 vs 14.88. The sentiment score for both is very close to 0 so they are both neither positive or negative.

From last article’s analysis, CNBC has a 92% and NYTimes has a 87% probability of being neutral respectively. Both AWS and Google seems to agree that the sentiments are most likely neutral.

Individual Article Analysis

Here is the focus point of this article, let’s evaluate one of the articles.

CNBC With Higest Magnitude
url gcloud_magnitude gcloud_score
https://www.cnbc.com/2018/09/18/iphone-xs-and-iphone-xs-max-review.html 30.3999996185302700 0.2000000029802322

This CNBC article reviews Apple’s new iPhones and it generates a magnitude score of 30.39, much higher than the average score of 4.65 for CNBC.

I read the article and there are phrases that indicates how emotional dramatic the article is written in. Here are some quotes:

They’re the best phones Apple has ever made.

The iPhone X, even a year later, is still arguably the best phone on the market.

It’s one of the best screens on the market…

The speakers sound awesome.

I love how shiny it is on the new gold and white models.

I love that iOS 12 gives you so much more control over notifications.

…these are the best phones Apple has made…

Judging from some of these statements, I can understand why Google’s algorithm gives it a high magnitude score. It’s full of dramatic adjectives like best and love.

Conclusion

The CNBC article reviewing Apple’s new iPhoneXS do seem dramatic and emotional. It am pretty impressed that Google Cloud Natural Language can understand that. In part 2, I will dive deeper into articles that have low and high scores from both sources.

How did you prepare for AWS Certified Solutions Architect – Associate Level certification?

I receive my AWS Certified Solutions Architect – Associate Level in Feb 2018. First, I think a course on Online Courses – Learn Anything, On Your Schedule | Udemy. Here is the link: AWS Certified Solutions Architect – Associate 2018

Second, I took the AWS CSAA Practice Test from whizlabs.com. It is a paid course but I think it is worth it.

Those two are enough for me and I passed it on the first try albeit barely. I also made sure to redo some of the labs on the udemy course so I can remember it more.

Quora link