Dev by PuMo

https://pumo.io/

Classifying the news

The fake and real news dataset is provided in the attachments.

In [39]:
import re
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

from wordcloud import WordCloud
from sklearn.pipeline import Pipeline
from sklearn.model_selection import train_test_split, cross_validate, StratifiedKFold
from sklearn.metrics import classification_report, accuracy_score, confusion_matrix, ConfusionMatrixDisplay
from sklearn.feature_extraction.text import TfidfVectorizer, ENGLISH_STOP_WORDS
from sklearn.feature_selection import SelectPercentile, SelectKBest, chi2
from sklearn.ensemble import RandomForestClassifier
In [4]:
import nltk
# nltk.download()
from nltk.tokenize import TweetTokenizer, word_tokenize
from nltk.corpus import stopwords
In [5]:
import datetime
from pygooglenews import GoogleNews

gn = GoogleNews()

# gn.set_time_range('10/01/2022','11/03/2022')


s = gn.search('Iran', when= '60d')

for entry in s["entries"]:
    print(entry["title"])

# dict_keys(['title', 'title_detail', 'links', 'link', 'id', 'guidislink', 'published', 'published_parsed', 'summary', 'summary_detail', 'source', 'sub_articles'])
2022-10-01
As many as 14,000 arrested in Iran over last six weeks, United Nations says - CNN
Why Aren't Iran's Workers Spearheading a General Strike? - Foreign Policy
Coverage Of Nationwide Protests In Iran On Thursday - ایران اینترنشنال
Iran Protests Surge, Driven by Demonstrators Mourning Their Dead - Bloomberg
Iran: Thousands of Detained Protesters and Activists in Peril - Human Rights Watch
Biden Says Iran Will Be 'Free' in Aside at Campaign Rally - Bloomberg
Opinion | Want to help the Iran protests? Let its soccer team play in the World Cup. - The Washington Post
Iran’s Khamenei Feels Lonely, Isolated, Says Pundit - ایران اینترنشنال
Cleric killed in restive Iranian city, protests rage on - Reuters
"France once again urges Iran to fully respect its international human (...) - France ONU
US Announces New Iran-Related Sanctions - ایران اینترنشنال
Germany Urges Citizens To Leave Iran, Starts Evacuating Embassy - ایران اینترنشنال
Head of Iran's Revolutionary Guards warns that Saturday is 'last day' of protests - CNN
Iran's Weapons Are Slowly Dragging Israel to Ukraine's Defense - Bloomberg
Serious Government Revenue Shortfall Continues In Iran - ایران اینترنشنال
Iran is preparing to send additional weapons including ballistic missiles to Russia to use in Ukraine, western officials say - CNN
Don't Expect Any More Russian Help on the Iran Nuclear Deal - War on the Rocks
US and Saudi Arabia concerned that Iran may be planning attack on energy infrastructure in Middle East - CNN
Iran could supply Russia with ballistic missiles, NATO chief says - Reuters
U.S. wants to oust Iran from U.N. women's commission - Reuters
'Don't stay silent about Iran genocide' Israeli activists urge new gov't - The Jerusalem Post
Inside Iran’s Evin Prison - The Atlantic
Iran's Leader Says Plans to Avenge US Strike on Top General Soleimani Ongoing - Bloomberg
Iran police force says it is investigating protester beating and shooting video - BBC
How did an IDF soldier avoid trouble during emergency landing in Iran? - The Jerusalem Post
US imposes new sanctions on Iranian officials over crackdown on protests - CNN
US Iran envoy says he is focused on 'where we can be useful' and not going to 'waste our time' on nuclear deal right now - CNN
Qatar World Cup: Welsh government boycott Wales-Iran game - BBC
Iran says it will sue US, alleging 'direct involvement' in protests - CNN
EXCLUSIVE - Iran Reportedly Brings Iraqi Allies To Crack Down On Protests - ایران اینترنشنال
Baffled By West’s Reluctance To Negotiate, Iran Is In Tight Spot - ایران اینترنشنال
Tehran Calls Iran International Similar 'To A Terrorist Media' - ایران اینترنشنال
Iran’s Revolutionary Guard issues warning to protestors about ‘end of the riots’ - PBS NewsHour
The Iranian people will no longer tolerate violence and oppression - GOV.UK
Iran Asks Countries Not To Attend UN Meeting On Rights Violations - ایران اینترنشنال
Analysis: Arabs view revived Netanyahu with concern but as balance against Iran - Reuters
Hard To Deal With Massive Corruption In Iran, Says Whistleblower - ایران اینترنشنال
How Iran's protests transformed into a national uprising - CNN
Iran's Secret Manual for Controlling Protesters' Mobile Phones - The Intercept
Iran protests: University students stage sit-down strikes - BBC
From protester to fighter: Fleeing Iran's brutal crackdown to take up arms over the border - CNN
A barrier of fear has been broken in Iran. The regime may be at a point of no return - CNN
Iran faces dilemma as children join protests in 'unprecedented' phenomenon - CNN
Investigation: How Iran’s security forces are shooting to kill with ‘non-combat’ shotgun shells - The France 24 Observers
London-based TV channel sparks Iranian leaders' ire amid protests - CNN
New Zealand suspends bilateral human rights dialogue with Iran - Reuters
Iran Is Now at War With Ukraine - Foreign Policy
Where is the US's red line on Iran's protests? - Middle East Institute
Iran Protests At Point Of ‘No Return’ - Former Hostage - ایران اینترنشنال
Iranian Officials Reportedly Sending Family And Assets Abroad - ایران اینترنشنال
Iran calls Western allegations that it supplied Russia with drones 'disappointing,' calls for peaceful resolution of war - CNBC
A young woman's death in Iran has sparked an uprising. News organizations are grappling with how to cover it - CNN
Joint Statement on Internet Shutdowns in Iran - United States Department of State - Department of State
The doctors risking it all to treat Iran's protesters - CNN
G7 ministers seek to boost unity on Ukraine, China, Iran - ABC News
Mapping Iran's unrest: how Mahsa Amini's death led to nationwide protests - The Guardian
The battle of narratives on Iran is being fought on social media - CNN
Protests Rock Iran Universities Before Nationwide Rallies Wednesday - ایران اینترنشنال
Analysis | What Iran's protest slogans tell us about the uprising - The Washington Post
Biden 'stunned' by Iranian protests: 'It's awakened something that I don't think will be quieted in a long, long time' - CNN
In Iran, the Song 'Baraye' Is Fueling Protests - Foreign Policy
Iran Urges US to Stop ‘Double-Dealing’ in Vienna Talks - Tasnim News Agency
Iran has sent military trainers to Crimea to train Russian forces to use drones - CNN
Iranian police looking into incident involving woman surrounded by officers in street - CNN
At least 2 killed in Iran as security forces intensify crackdown over protests - CNN
Iran's 'women's revolution' could be a Berlin Wall moment - CNN
Iranian teachers call for nationwide strike in protest over deaths and detention of students - CNN
Iran Indicts Hundreds as Students and Strikes Sustain Protests - Bloomberg
Opinion | In Iran, a new generation rises. The theocracy strikes back. - The Washington Post
Coverage Of Nationwide Protests In Iran On October 12 - ایران اینترنشنال
UN human rights chief fears pushback on gender issues, cites Iran clampdown - Reuters
The digital news site breaking stories on Iran's protests - The Washington Post
2 killed in highway ‘riots’ in Iran - The Siasat Daily
Biden Admin Hesitant To Characterize Iran Protests - ایران اینترنشنال
Protests continue in Iran amid lethal crackdown by security forces - CBS News
Iran allows detained New Zealand social media influencers to leave - CNN
A Chance to Be on Right Side of History in Iran - Foreign Policy
As Iran Protests Continue, Outrage Defies Violent Crackdowns - Northeastern University
Coverage Of Nationwide Protests In Iran On October 8 - ایران اینترنشنال
Exiled Prince Says Iran’s Protests Will Change The World - ایران اینترنشنال
Iran’s regime kills a female surgeon during doctors' protests in Tehran - The Jerusalem Post
Biden has it backwards on Iran, Saudi Arabia - The Hill
Coverage of Nationwide Protests In Iran On October 19 - ایران اینترنشنال
Protests In Iran, Abroad Boost Sense Of Unity, Solidarity - ایران اینترنشنال
Iranian official admits that student protesters are being taken to psychiatric institutions - CNN
France urges French nationals to leave Iran "as soon as possible" - CNN
Opinion | ‘It’s Like a War Out There.’ Iran’s Women Haven’t Been This Angry in a Generation. - The New York Times
19 children among 185 killed during protest unrest in Iran, rights group says - Axios
US State Department says Iran nuclear deal 'not our focus right now' - CNN
In [6]:
type(s)
Out[6]:
dict
In [7]:
pd.options.display.max_colwidth = 128

listTmpContent = []
listTmpSource = []

# Collect each entry's headline and the name of its source
for entry in s["entries"]:
    listTmpContent.append(entry["title"])
    listTmpSource.append(entry["source"]["title"])

gNewsDf = pd.DataFrame({'content': listTmpContent,
                        'source': listTmpSource})

gNewsDf = gNewsDf.replace(['ایران اینترنشنال'], 'Iran International')
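The loop above can also be written as a single list comprehension. The following is a minimal sketch, assuming each entry carries the `title` and `source` → `title` keys used above; `sample_entries` and `entries_to_frame` are made-up names, and the sample data only stands in for `s["entries"]`.

```python
import pandas as pd

# Made-up entries that mimic the structure of s["entries"] (illustrative only)
sample_entries = [
    {"title": "Headline one - CNN", "source": {"title": "CNN"}},
    {"title": "Headline two - ایران اینترنشنال", "source": {"title": "ایران اینترنشنال"}},
]

def entries_to_frame(entries):
    """Build a DataFrame with one row per feed entry."""
    rows = [{"content": e["title"], "source": e["source"]["title"]} for e in entries]
    frame = pd.DataFrame(rows, columns=["content", "source"])
    # Translate the Persian source name in the 'source' column only
    frame["source"] = frame["source"].replace("ایران اینترنشنال", "Iran International")
    return frame

sample_df = entries_to_frame(sample_entries)
print(sample_df)
```

Restricting the replacement to the `source` column keeps the original headline text untouched, which matches what the notebook's output shows.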
In [8]:
def process_google_news(text):
    """Clean a Google News title: drop the trailing source name, then strip punctuation."""
    punctuation_regex = re.compile(r'[^\w\s]+')

    # Keep everything before the LAST occurrence of "-" (the source name follows it)
    X = text.rsplit('-', 1)[0]
    X = punctuation_regex.sub('', X)

    return X
In [9]:
gNewsDf['content'].head().apply(process_google_news)
Out[9]:
0    As many as 14000 arrested in Iran over last six weeks United Nations says 
1                        Why Arent Irans Workers Spearheading a General Strike 
2                          Coverage Of Nationwide Protests In Iran On Thursday 
3              Iran Protests Surge Driven by Demonstrators Mourning Their Dead 
4                 Iran Thousands of Detained Protesters and Activists in Peril 
Name: content, dtype: object
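Splitting on a bare "-" works for the headlines shown here, but it would cut inside a source name that itself contains a hyphen, and it leaves a trailing space. The sketch below is one possible variant, not the notebook's function: it splits on the " - " separator and trims whitespace. `clean_title` is a hypothetical name, and "Al-Monitor" is used only as an example of a hyphenated source.

```python
import re

PUNCTUATION = re.compile(r"[^\w\s]+")

def clean_title(title):
    """Drop the trailing ' - Source' suffix, then strip punctuation and outer spaces."""
    headline = title.rsplit(" - ", 1)[0]
    return PUNCTUATION.sub("", headline).strip()

print(clean_title("Iran protests: University students stage sit-down strikes - BBC"))
print(clean_title("Example headline - Al-Monitor"))
```

Because the cleaned text no longer ends in a space, its output differs slightly from the `Out[9]` listing above.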
In [10]:
gNewsDf['cleancontent'] = gNewsDf['content'].apply(process_google_news)
gNewsDf
Out[10]:
content source cleancontent
0 As many as 14,000 arrested in Iran over last six weeks, United Nations says - CNN CNN As many as 14000 arrested in Iran over last six weeks United Nations says
1 Why Aren't Iran's Workers Spearheading a General Strike? - Foreign Policy Foreign Policy Why Arent Irans Workers Spearheading a General Strike
2 Coverage Of Nationwide Protests In Iran On Thursday - ایران اینترنشنال Iran International Coverage Of Nationwide Protests In Iran On Thursday
3 Iran Protests Surge, Driven by Demonstrators Mourning Their Dead - Bloomberg Bloomberg Iran Protests Surge Driven by Demonstrators Mourning Their Dead
4 Iran: Thousands of Detained Protesters and Activists in Peril - Human Rights Watch Human Rights Watch Iran Thousands of Detained Protesters and Activists in Peril
... ... ... ...
84 Iranian official admits that student protesters are being taken to psychiatric institutions - CNN CNN Iranian official admits that student protesters are being taken to psychiatric institutions
85 France urges French nationals to leave Iran "as soon as possible" - CNN CNN France urges French nationals to leave Iran as soon as possible
86 Opinion | ‘It’s Like a War Out There.’ Iran’s Women Haven’t Been This Angry in a Generation. - The New York Times The New York Times Opinion Its Like a War Out There Irans Women Havent Been This Angry in a Generation
87 19 children among 185 killed during protest unrest in Iran, rights group says - Axios Axios 19 children among 185 killed during protest unrest in Iran rights group says
88 US State Department says Iran nuclear deal 'not our focus right now' - CNN CNN US State Department says Iran nuclear deal not our focus right now

89 rows × 3 columns

In [11]:
content_join = ' '.join(gNewsDf['cleancontent'])
source_join = ' '.join(gNewsDf['source'])

wordcloud_content = WordCloud(stopwords=ENGLISH_STOP_WORDS,
                           background_color='white', 
                           width=1200, height=1000).generate(content_join)
wordcloud_source = WordCloud(stopwords=ENGLISH_STOP_WORDS,
                           background_color='white', 
                           width=1200, height=1000).generate(source_join)

plt.figure(figsize = [8, 7])
plt.imshow(wordcloud_content)
plt.axis('off')
plt.title('Iran News')
plt.show()

plt.figure(figsize = [8, 7])
plt.imshow(wordcloud_source)
plt.axis('off')
plt.title('Iran News Source')
plt.show()
[Figure: word cloud, "Iran News"]
[Figure: word cloud, "Iran News Source"]
In [12]:
# Read Fake News Dataset:

fake_df = pd.read_csv('Fake.csv')
fake_df['label'] = 0
fake_df.head()
Out[12]:
title text subject date label
0 Donald Trump Sends Out Embarrassing New Year’s Eve Message; This is Disturbing Donald Trump just couldn t wish all Americans a Happy New Year and leave it at that. Instead, he had to give a shout out to ... News December 31, 2017 0
1 Drunk Bragging Trump Staffer Started Russian Collusion Investigation House Intelligence Committee Chairman Devin Nunes is going to have a bad day. He s been under the assumption, like many of u... News December 31, 2017 0
2 Sheriff David Clarke Becomes An Internet Joke For Threatening To Poke People ‘In The Eye’ On Friday, it was revealed that former Milwaukee Sheriff David Clarke, who was being considered for Homeland Security Secret... News December 30, 2017 0
3 Trump Is So Obsessed He Even Has Obama’s Name Coded Into His Website (IMAGES) On Christmas day, Donald Trump announced that he would be back to work the following day, but he is golfing for the fourth... News December 29, 2017 0
4 Pope Francis Just Called Out Donald Trump During His Christmas Speech Pope Francis used his annual Christmas Day message to rebuke Donald Trump without even mentioning his name. The Pope deliver... News December 25, 2017 0
In [13]:
# Read True News Dataset:

true_df = pd.read_csv('True.csv')
true_df['label'] = 1
true_df.head()
Out[13]:
title text subject date label
0 As U.S. budget fight looms, Republicans flip their fiscal script WASHINGTON (Reuters) - The head of a conservative Republican faction in the U.S. Congress, who voted this month for a huge e... politicsNews December 31, 2017 1
1 U.S. military to accept transgender recruits on Monday: Pentagon WASHINGTON (Reuters) - Transgender people will be allowed for the first time to enlist in the U.S. military starting on Mond... politicsNews December 29, 2017 1
2 Senior U.S. Republican senator: 'Let Mr. Mueller do his job' WASHINGTON (Reuters) - The special counsel investigation of links between Russia and President Trump’s 2016 election campaig... politicsNews December 31, 2017 1
3 FBI Russia probe helped by Australian diplomat tip-off: NYT WASHINGTON (Reuters) - Trump campaign adviser George Papadopoulos told an Australian diplomat in May 2016 that Russia had po... politicsNews December 30, 2017 1
4 Trump wants Postal Service to charge 'much more' for Amazon shipments SEATTLE/WASHINGTON (Reuters) - President Donald Trump called on the U.S. Postal Service on Friday to charge “much more” to s... politicsNews December 29, 2017 1
In [14]:
df = true_df.copy(deep=True)
# df = df.append(fake_df, ignore_index=True)
df = pd.concat([df, fake_df], ignore_index=True)
df
Out[14]:
title text subject date label
0 As U.S. budget fight looms, Republicans flip their fiscal script WASHINGTON (Reuters) - The head of a conservative Republican faction in the U.S. Congress, who voted this month for a huge e... politicsNews December 31, 2017 1
1 U.S. military to accept transgender recruits on Monday: Pentagon WASHINGTON (Reuters) - Transgender people will be allowed for the first time to enlist in the U.S. military starting on Mond... politicsNews December 29, 2017 1
2 Senior U.S. Republican senator: 'Let Mr. Mueller do his job' WASHINGTON (Reuters) - The special counsel investigation of links between Russia and President Trump’s 2016 election campaig... politicsNews December 31, 2017 1
3 FBI Russia probe helped by Australian diplomat tip-off: NYT WASHINGTON (Reuters) - Trump campaign adviser George Papadopoulos told an Australian diplomat in May 2016 that Russia had po... politicsNews December 30, 2017 1
4 Trump wants Postal Service to charge 'much more' for Amazon shipments SEATTLE/WASHINGTON (Reuters) - President Donald Trump called on the U.S. Postal Service on Friday to charge “much more” to s... politicsNews December 29, 2017 1
... ... ... ... ... ...
44893 McPain: John McCain Furious That Iran Treated US Sailors Well 21st Century Wire says As 21WIRE reported earlier this week, the unlikely mishap of two US Naval vessels straying into Ira... Middle-east January 16, 2016 0
44894 JUSTICE? Yahoo Settles E-mail Privacy Class-action: $4M for Lawyers, $0 for Users 21st Century Wire says It s a familiar theme. Whenever there is a dispute or a change of law, and two tribes go to war, ther... Middle-east January 16, 2016 0
44895 Sunnistan: US and Allied ‘Safe Zone’ Plan to Take Territorial Booty in Northern Syria Patrick Henningsen 21st Century WireRemember when the Obama Administration told the world how it hoped to identify 5,000 re... Middle-east January 15, 2016 0
44896 How to Blow $700 Million: Al Jazeera America Finally Calls it Quits 21st Century Wire says Al Jazeera America will go down in history as one of the biggest failures in broadcast media history.... Middle-east January 14, 2016 0
44897 10 U.S. Navy Sailors Held by Iranian Military – Signs of a Neocon Political Stunt 21st Century Wire says As 21WIRE predicted in its new year s look ahead, we have a new hostage crisis underway.Today, Iran... Middle-east January 12, 2016 0

44898 rows × 5 columns

In [15]:
# empty cells
print(df.columns[df.isnull().any()])
Index([], dtype='object')
In [43]:
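# Join titles and texts; the extra ' ' keeps the last title word separate from the first text word
fake_text = ' '.join(fake_df['title']) + ' ' + ' '.join(fake_df['text'])
true_text = ' '.join(true_df['title']) + ' ' + ' '.join(true_df['text'])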

wordcloud_fake = WordCloud(stopwords=ENGLISH_STOP_WORDS,
                           background_color='white', 
                           width=1200, height=1000).generate(fake_text)
wordcloud_true = WordCloud(stopwords=ENGLISH_STOP_WORDS,
                           background_color='white', 
                           width=1200, height=1000).generate(true_text)

plt.figure(figsize = [8, 7])
plt.imshow(wordcloud_fake)
plt.axis('off')
plt.title('Fake News')
plt.show()

plt.figure(figsize = [8, 7])
plt.imshow(wordcloud_true)
plt.axis('off')
plt.title('Real News')
plt.show()
[Figure: word cloud, "Fake News"]
[Figure: word cloud, "Real News"]
In [16]:
df.shape
Out[16]:
(44898, 5)
In [17]:
df.drop_duplicates(inplace=True)
df.shape
Out[17]:
(44689, 5)
In [18]:
df.isnull().sum()
Out[18]:
title      0
text       0
subject    0
date       0
label      0
dtype: int64

Check for Class Imbalance

In [49]:
target = df['label']
sns.set_style('whitegrid')
sns.countplot(x=target)
Out[49]:
<AxesSubplot:xlabel='label', ylabel='count'>
[Figure: count plot of "label"]
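The count plot can be complemented with the actual numbers. A minimal sketch, using a toy label series rather than the real `df['label']`, that tabulates the count and share of each class:

```python
import pandas as pd

# Toy labels for illustration only: 0 = fake, 1 = real (same coding as above)
toy_labels = pd.Series([0, 0, 0, 1, 1], name="label")

counts = toy_labels.value_counts().sort_index()
shares = toy_labels.value_counts(normalize=True).sort_index()

# One row per class, with absolute count and relative share
balance = pd.DataFrame({"count": counts, "share": shares})
print(balance)
```

In the notebook, `toy_labels` would be replaced by `df['label']`.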

Subject Counts Plot

In [48]:
# Plot a bar graph of the number of articles per subject (column 'subject')
subject_count = df['subject'].value_counts()
plt.figure(figsize=(10,5))
sns.barplot(x=subject_count.index, y=subject_count.values, alpha=0.8)
plt.title('DataSet News Subjects')
plt.ylabel('Number of Occurrences', fontsize=12)
plt.xlabel('Subjects', fontsize=12)
plt.show()
[Figure: bar plot, "DataSet News Subjects"]
In [24]:
# Data pre-processing
# Use the titles only; the title + text variant is kept below for reference
# X = df['title'] + ' ' + df['text']
X = df['title']
y = df['label']

punctuation_regex = re.compile(r'[^\w\s]+')
urls_regex = re.compile(r'(https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9][a-zA-Z0-9-]+'
                        r'[a-zA-Z0-9]\.[^\s]{2,}|www\.[a-zA-Z0-9][a-zA-Z0-9-]+['
                        r'a-zA-Z0-9]\.[^\s]{2,}|https?:\/\/(?:www\.|(?!www))[a-'
                        r'zA-Z0-9]+\.[^\s]{2,}|www\.[a-zA-Z0-9]+\.[^\s]{2,})')

# Apply data cleaning
X = X.apply(lambda x: urls_regex.sub('', str(x)))
X = X.apply(lambda x: ' '.join([item for item in x.split() if item not in ENGLISH_STOP_WORDS]))
X = X.apply(lambda x: punctuation_regex.sub('', str(x)))

# Split data to 80/20 ratio
X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                    test_size=0.2,
                                                    random_state=10)
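One detail of the stop-word step above: the membership test is case-sensitive, and scikit-learn's `ENGLISH_STOP_WORDS` holds lowercase words, so capitalized words such as "The" pass through it. The vectorizer used later lowercases and filters stop words again, so the practical effect is probably small. The sketch below illustrates the difference with a small stand-in set (`STOP_WORDS`, a made-up subset so the example runs on its own) and a hypothetical helper `drop_stop_words`.

```python
# Small stand-in stop-word set so the sketch runs on its own;
# in the notebook this would be sklearn's ENGLISH_STOP_WORDS.
STOP_WORDS = frozenset({"the", "of", "a", "to", "in"})

def drop_stop_words(text, stop_words=STOP_WORDS):
    """Remove stop words, comparing in lowercase so 'The' matches 'the'."""
    return " ".join(w for w in text.split() if w.lower() not in stop_words)

title = "The Head of a Committee"

# Case-sensitive filter, as in the cell above: 'The' is kept
case_sensitive = " ".join(w for w in title.split() if w not in STOP_WORDS)
# Case-insensitive filter: 'The' is removed as well
case_insensitive = drop_stop_words(title)

print(case_sensitive)
print(case_insensitive)
```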
In [26]:
# Random Forests + TF-IDF
# Set up the model pipeline

tweet_tok = TweetTokenizer(strip_handles = True,
                         reduce_len = True)

tf_parameters = {
    'tokenizer': tweet_tok.tokenize,
    'analyzer': 'word',
    'use_idf': False,  # IDF weighting is off, so the features are normalized term frequencies
    'smooth_idf': True,
    'sublinear_tf': False,
    'ngram_range': (1,2),
    'lowercase': True,
    'stop_words': ENGLISH_STOP_WORDS
}


# TF-IDF vectorizer
tfidf = TfidfVectorizer(**tf_parameters)

# Feature selection (percentile is given in percent: 0.4 keeps the top 0.4% of features by chi2 score)
fs = SelectPercentile(score_func=chi2, percentile=0.4)

# Pipeline for the classifier
pipeline = Pipeline(
    [
        ('vect', tfidf),
        ('fs', fs),
        ('clf', RandomForestClassifier(max_features='sqrt', n_estimators=1000, n_jobs=1))
    ]
)

# StratifiedKFold object with 5 shuffled splits (not passed to cross_validate below, which uses cv=5)
folds = StratifiedKFold(n_splits=5, shuffle=True, random_state=10)

scores = cross_validate(pipeline, X_train, y_train,
                        scoring=['accuracy', 'precision_macro', 'recall_macro', 'f1_macro'],
                        cv=5,
                        n_jobs=1,
                        return_train_score=False)

print('Cross validation scores', scores)

pipeline.fit(X_train, y_train)
y_pred = pipeline.predict(X_test)

print(classification_report(y_test, y_pred))
print(f"Accuracy: {accuracy_score(y_test, y_pred)}")

cm = confusion_matrix(y_test, y_pred)
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=pipeline.classes_)
disp.plot()
plt.show()
Cross validation scores {'fit_time': array([109.88962007, 105.761585  , 106.13423753, 106.10964227,
       105.90334606]), 'score_time': array([3.74106693, 3.6589992 , 3.65499902, 3.69725537, 3.64848709]), 'test_accuracy': array([0.8886869 , 0.8848951 , 0.8820979 , 0.88587413, 0.89062937]), 'test_precision_macro': array([0.88888869, 0.88547684, 0.8828864 , 0.88656665, 0.89139498]), 'test_recall_macro': array([0.88988466, 0.88636569, 0.88369034, 0.88740507, 0.8922155 ]), 'test_f1_macro': array([0.88863322, 0.88486321, 0.88207437, 0.88584808, 0.89060711])}
              precision    recall  f1-score   support

           0       0.93      0.88      0.90      4690
           1       0.87      0.93      0.90      4248

    accuracy                           0.90      8938
   macro avg       0.90      0.90      0.90      8938
weighted avg       0.90      0.90      0.90      8938

Accuracy: 0.9013202058626091
[Figure: confusion matrix]
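The raw dictionary of cross-validation scores is hard to read at a glance. A minimal sketch that summarizes the five per-fold accuracies printed above as mean and standard deviation, for comparison with the held-out accuracy of about 0.901:

```python
import numpy as np

# Per-fold accuracy copied from the cross-validation output above
test_accuracy = np.array([0.8886869, 0.8848951, 0.8820979, 0.88587413, 0.89062937])

mean_acc = test_accuracy.mean()
std_acc = test_accuracy.std(ddof=1)  # sample standard deviation across folds

print(f"accuracy: {mean_acc:.4f} +/- {std_acc:.4f}")
```

In the notebook the array would come from `scores['test_accuracy']`; the same two lines apply to the precision, recall and F1 arrays.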
In [36]:
g_news_labels = pipeline.predict(gNewsDf['cleancontent'])
print(g_news_labels)
g_news_labels = np.where(g_news_labels, 'real', 'fake')
print(g_news_labels)
gNewsDf['predictedLabel'] = g_news_labels
gNewsDf
[1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 1 1 0 1 1 1 0 1 0 1 1 1
 1 1 0 0 1 1 1 1 1 1 1 1 0 1 0 1 1 1 1 0 1 1 0 1 1 1 1 1 1 0 0 1 1 1 0 1 1
 1 1 1 1 1 1 1 1 1 1 0 1 0 1 1]
['real' 'real' 'real' 'real' 'real' 'real' 'real' 'real' 'real' 'real'
 'real' 'real' 'real' 'real' 'real' 'real' 'real' 'real' 'real' 'real'
 'real' 'real' 'real' 'fake' 'fake' 'real' 'real' 'fake' 'real' 'real'
 'real' 'fake' 'real' 'fake' 'real' 'real' 'real' 'real' 'real' 'fake'
 'fake' 'real' 'real' 'real' 'real' 'real' 'real' 'real' 'real' 'fake'
 'real' 'fake' 'real' 'real' 'real' 'real' 'fake' 'real' 'real' 'fake'
 'real' 'real' 'real' 'real' 'real' 'real' 'fake' 'fake' 'real' 'real'
 'real' 'fake' 'real' 'real' 'real' 'real' 'real' 'real' 'real' 'real'
 'real' 'real' 'real' 'real' 'fake' 'real' 'fake' 'real' 'real']
Out[36]:
content source cleancontent predictedLabel
0 As many as 14,000 arrested in Iran over last six weeks, United Nations says - CNN CNN As many as 14000 arrested in Iran over last six weeks United Nations says real
1 Why Aren't Iran's Workers Spearheading a General Strike? - Foreign Policy Foreign Policy Why Arent Irans Workers Spearheading a General Strike real
2 Coverage Of Nationwide Protests In Iran On Thursday - ایران اینترنشنال Iran International Coverage Of Nationwide Protests In Iran On Thursday real
3 Iran Protests Surge, Driven by Demonstrators Mourning Their Dead - Bloomberg Bloomberg Iran Protests Surge Driven by Demonstrators Mourning Their Dead real
4 Iran: Thousands of Detained Protesters and Activists in Peril - Human Rights Watch Human Rights Watch Iran Thousands of Detained Protesters and Activists in Peril real
... ... ... ... ...
84 Iranian official admits that student protesters are being taken to psychiatric institutions - CNN CNN Iranian official admits that student protesters are being taken to psychiatric institutions fake
85 France urges French nationals to leave Iran "as soon as possible" - CNN CNN France urges French nationals to leave Iran as soon as possible real
86 Opinion | ‘It’s Like a War Out There.’ Iran’s Women Haven’t Been This Angry in a Generation. - The New York Times The New York Times Opinion Its Like a War Out There Irans Women Havent Been This Angry in a Generation fake
87 19 children among 185 killed during protest unrest in Iran, rights group says - Axios Axios 19 children among 185 killed during protest unrest in Iran rights group says real
88 US State Department says Iran nuclear deal 'not our focus right now' - CNN CNN US State Department says Iran nuclear deal not our focus right now real

89 rows × 4 columns
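`np.where(g_news_labels, 'real', 'fake')` relies on 1 being truthy. An alternative that makes the coding explicit (0 = fake from `Fake.csv`, 1 = real from `True.csv`) is to index into an array of names. This is a sketch with hypothetical names `LABEL_NAMES` and `name_labels`, shown on a short made-up prediction list:

```python
import numpy as np

# Label coding from the training data: 0 = fake (Fake.csv), 1 = real (True.csv)
LABEL_NAMES = np.array(["fake", "real"])

def name_labels(predictions):
    """Translate 0/1 predictions to names by indexing into LABEL_NAMES."""
    return LABEL_NAMES[np.asarray(predictions, dtype=int)]

named = name_labels([1, 1, 0, 1])
print(named)
```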

In [28]:
gNewsDf['predictedLabel'].value_counts()
Out[28]:
real    73
fake    16
Name: predictedLabel, dtype: int64
In [29]:
gNewsDf['source'].value_counts()
Out[29]:
CNN                        25
Iran International         19
Reuters                     6
Bloomberg                   5
The Washington Post         4
Foreign Policy              4
BBC                         3
The Jerusalem Post          3
The Guardian                1
Tasnim News Agency          1
The Siasat Daily            1
Northeastern University     1
CBS News                    1
Department of State         1
The Hill                    1
The New York Times          1
ABC News                    1
The Intercept               1
CNBC                        1
Middle East Institute       1
The France 24 Observers     1
GOV.UK                      1
PBS NewsHour                1
The Atlantic                1
War on the Rocks            1
France ONU                  1
Human Rights Watch          1
Axios                       1
Name: source, dtype: int64
In [33]:
new_df = pd.crosstab(gNewsDf['source'], gNewsDf['predictedLabel'])
# new_df['Fake.Ratio'] 
# new_df['Real.Ratio'] 
In [34]:
new_df
Out[34]:
predictedLabel fake real
source
ABC News 0 1
Axios 0 1
BBC 3 0
Bloomberg 1 4
CBS News 0 1
CNBC 0 1
CNN 6 19
Department of State 0 1
Foreign Policy 0 4
France ONU 0 1
GOV.UK 1 0
Human Rights Watch 0 1
Iran International 2 17
Middle East Institute 0 1
Northeastern University 0 1
PBS NewsHour 0 1
Reuters 0 6
Tasnim News Agency 0 1
The Atlantic 0 1
The France 24 Observers 0 1
The Guardian 0 1
The Hill 0 1
The Intercept 0 1
The Jerusalem Post 1 2
The New York Times 1 0
The Siasat Daily 0 1
The Washington Post 1 3
War on the Rocks 0 1
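The commented-out `Fake.Ratio` and `Real.Ratio` columns suggest per-source shares were intended. One way to get them, sketched here on toy data that reuses two rows of the table above (CNN and BBC), is `pd.crosstab` with `normalize="index"`:

```python
import pandas as pd

# Toy data reusing two rows of the table above: CNN (6 fake, 19 real), BBC (3 fake, 0 real)
toy = pd.DataFrame({
    "source": ["CNN"] * 25 + ["BBC"] * 3,
    "predictedLabel": ["fake"] * 6 + ["real"] * 19 + ["fake"] * 3,
})

# Absolute counts per source and label
counts = pd.crosstab(toy["source"], toy["predictedLabel"])
# Row-normalized shares: each source's row sums to 1
ratios = pd.crosstab(toy["source"], toy["predictedLabel"], normalize="index")

print(ratios.round(2))
```

These shares describe the classifier's predictions, not verified facts about the outlets, and sources with only one headline can only have shares of 0 or 1.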
In [38]:
# Creating barplot
pl = new_df.plot(kind="bar", stacked=True, rot=90)
pl
Out[38]:
<AxesSubplot:xlabel='source'>
[Figure: stacked bar plot of predicted labels per source]