This is part 1 of a series on evaluating the sentiment of news articles using ML/AI.
This post builds on work from last week as I explore news articles with ML/AI. To recap, I aggregated the top news from CNBC and NYTimes and calculated their overall sentiment score. However, since all the news articles are combined together, there is no way to evaluate them individually.
In this post, I will examine the sentiment of individual articles.
Why the change?
Because of AWS’s limitations. According to their guidelines and limits, the maximum size for sentiment detection is 5KB. That is only about 5,000 characters of plain English text! If an article goes over the limit, I have to split it and analyze each piece separately. Then I need to weight the pieces appropriately and do a final calculation. I am lazy, so I looked for a better solution. I found it with Google Cloud Natural Language.
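For reference, the split-and-weight workaround described above could be sketched roughly like this. It is a minimal illustration, not what I actually ran: score_chunk is a hypothetical callable standing in for whatever sentiment service you use, it is assumed to return a single number per chunk, and the 5,000-byte limit is taken from the 5KB figure above.

```python
from typing import Callable, List

MAX_BYTES = 5000  # assumed per-request limit, based on the 5KB figure above


def split_by_bytes(text: str, max_bytes: int = MAX_BYTES) -> List[str]:
    """Split text on whitespace into chunks whose UTF-8 size fits max_bytes.

    A single word longer than max_bytes is kept whole, so it can still exceed
    the limit; this sketch does not handle that edge case.
    """
    chunks: List[str] = []
    current: List[str] = []
    current_size = 0
    for word in text.split():
        size = len(word.encode("utf-8")) + 1  # +1 for the joining space
        if current and current_size + size > max_bytes:
            chunks.append(" ".join(current))
            current, current_size = [], 0
        current.append(word)
        current_size += size
    if current:
        chunks.append(" ".join(current))
    return chunks


def weighted_sentiment(
    text: str,
    score_chunk: Callable[[str], float],  # hypothetical scorer, one number per chunk
    max_bytes: int = MAX_BYTES,
) -> float:
    """Score each chunk, then average the scores weighted by chunk size in bytes."""
    chunks = split_by_bytes(text, max_bytes)
    if not chunks:
        return 0.0
    weights = [len(chunk.encode("utf-8")) for chunk in chunks]
    weighted_sum = sum(score_chunk(c) * w for c, w in zip(chunks, weights))
    return weighted_sum / sum(weights)
```

Weighting by chunk size keeps a short trailing chunk from counting as much as a full one. If the service returns several values per chunk (for example, one probability per sentiment class), the same weighting would be applied to each value separately.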
Note to businesses: This is a reason why customers switch to a different service.
Google Cloud Natural Language
Google Cloud Natural Language derives insights from unstructured text using Google machine learning.
Google’s sentiment analysis is less specific than AWS’s. Google provides two values: magnitude and score. A score of 0 is neutral. A score of less than 0 is considered to have negative emotion, and a score greater than 0 is considered to have positive emotion. The magnitude indicates the level of emotional content. Pretty vague, but let’s take a look at some samples. You can read more about Google’s sentiment analysis here.
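To make the two values a bit more concrete, here is a rough sketch of how one might fetch and read them. It assumes the google-cloud-language Python client (language_v1) is installed and credentials are configured. The describe helper and its thresholds (neutral_band, strong_magnitude) are my own illustrative assumptions, not values published by Google.

```python
from typing import Tuple


def analyze_sentiment(text: str) -> Tuple[float, float]:
    """Return (score, magnitude) for text from Google Cloud Natural Language."""
    # Imported lazily so describe() below works without the client installed.
    from google.cloud import language_v1

    client = language_v1.LanguageServiceClient()
    document = language_v1.Document(
        content=text, type_=language_v1.Document.Type.PLAIN_TEXT
    )
    response = client.analyze_sentiment(request={"document": document})
    sentiment = response.document_sentiment
    return sentiment.score, sentiment.magnitude


def describe(
    score: float,
    magnitude: float,
    neutral_band: float = 0.1,      # assumed: scores this close to 0 count as neutral
    strong_magnitude: float = 3.0,  # assumed: cutoff for "a lot of emotional content"
) -> str:
    """Turn a (score, magnitude) pair into a rough human-readable label."""
    if score > neutral_band:
        return "positive"
    if score < -neutral_band:
        return "negative"
    # A near-zero score with a high magnitude suggests positive and negative
    # passages cancelling each other out, rather than a flat, unemotional text.
    return "mixed" if magnitude >= strong_magnitude else "neutral"
```

For example, describe(*analyze_sentiment(article_text)) would label one article, where article_text is whatever string you pass in.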
Enough theory! Let’s analyze some examples of the output and see if they make sense.
Here are the results for the average magnitude and score for CNBC and NYTimes:
NYTimes is more emotional based on its average gcloud_magnitude score: 14.88 vs. 4.65 for CNBC. The sentiment scores for both are very close to 0, so both are neither positive nor negative.
In last article’s analysis, CNBC had a 92% and NYTimes an 87% probability of being neutral. Both AWS and Google seem to agree that the sentiments are most likely neutral.
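The per-source averages can be computed along these lines. This is only a sketch: the row layout and the source and gcloud_score field names are assumptions (only gcloud_magnitude appears in my results), and the values in the usage example are placeholders, not my real data.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Mapping

FIELDS = ("gcloud_score", "gcloud_magnitude")  # gcloud_score is an assumed name


def average_by_source(rows: Iterable[Mapping]) -> Dict[str, Dict[str, float]]:
    """Average the score and magnitude fields for each news source."""
    grouped = defaultdict(lambda: {field: [] for field in FIELDS})
    for row in rows:
        for field in FIELDS:
            grouped[row["source"]][field].append(row[field])
    return {
        source: {field: mean(values) for field, values in columns.items()}
        for source, columns in grouped.items()
    }


# Placeholder rows for illustration only.
example_rows = [
    {"source": "A", "gcloud_score": 0.1, "gcloud_magnitude": 2.0},
    {"source": "A", "gcloud_score": -0.1, "gcloud_magnitude": 4.0},
    {"source": "B", "gcloud_score": 0.2, "gcloud_magnitude": 10.0},
]
averages = average_by_source(example_rows)
```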
Individual Article Analysis
Here is the focus of this article: let’s evaluate one of the articles.
CNBC With Highest Magnitude
This CNBC article reviews Apple’s new iPhones and it generates a magnitude score of 30.39, much higher than the average score of 4.65 for CNBC.
I read the article, and there are phrases that show how emotional and dramatic the writing is. Here are some quotes:
They’re the best phones Apple has ever made.
The iPhone X, even a year later, is still arguably the best phone on the market.
It’s one of the best screens on the market…
The speakers sound awesome.
I love how shiny it is on the new gold and white models.
I love that iOS 12 gives you so much more control over notifications.
…these are the best phones Apple has made…
Judging from some of these statements, I can understand why Google’s algorithm gives it a high magnitude score. It’s full of dramatic adjectives like “best” and “awesome.”
The CNBC article reviewing Apple’s new iPhone XS does seem dramatic and emotional. I am pretty impressed that Google Cloud Natural Language can pick up on that. In part 2, I will dive deeper into articles that have low and high scores from both sources.