
Fake News Detector in Python – Build Your Own Truth Checker Tool
Today, misinformation doesn’t stay limited to social platforms; it spreads across blogs, websites, and messaging apps.
It affects elections, markets, public opinion, and even personal relationships. As developers, we don’t just consume technology — we create solutions.
In this article, I’ll show you how to build a Fake News Detector using Python.
No heavy math, no boring theory — just practical learning.
Why Fake News Detection Matters Today
Every day:
- WhatsApp forwards spread misinformation
- Fake headlines manipulate clicks
- AI-generated content increases confusion
Manually verifying news is slow and unreliable.
This is where Machine Learning helps.
A Fake News Detector:
- Reads news text
- Learns language patterns
- Predicts whether news is Real or Fake
How a Fake News Detector Works (Simple Explanation)
Think of it like spam detection for news.
Step-by-step logic:
- Input → News article text
- Processing → Clean & understand text
- Learning → Compare with past examples
- Prediction → Fake or Real
The computer doesn’t “understand” truth; it recognizes patterns.
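The four steps above map onto a scikit-learn pipeline roughly as follows. This is a minimal sketch on a made-up four-sentence dataset: the example texts are hypothetical, and `LogisticRegression` is only a stand-in classifier chosen for brevity, not the model used later in this article.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy examples; a real detector needs thousands of labeled articles.
texts = [
    "Government announces new education policy",
    "Central bank publishes quarterly inflation report",
    "Celebrity cured cancer with lemon juice overnight",
    "Miracle pill makes you lose weight while sleeping",
]
labels = ["REAL", "REAL", "FAKE", "FAKE"]

# Processing (TF-IDF) and learning (a linear classifier) chained into one object.
pipeline = make_pipeline(TfidfVectorizer(), LogisticRegression())
pipeline.fit(texts, labels)

# Prediction: the pipeline vectorizes the new text, then classifies it.
prediction = pipeline.predict(["Miracle lemon juice cured everything"])[0]
print(prediction)
```

With only four training sentences the prediction means very little; the point is the shape of the flow from input to label.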
Tools & Libraries Used
We’ll use industry-standard Python libraries:
- pandas – data handling
- numpy – numerical operations
- scikit-learn (imported as sklearn) – machine learning
- TfidfVectorizer – converts text into weighted numeric features
- PassiveAggressiveClassifier – fast & effective model
These libraries are widely used and well documented.
Dataset for Fake News Detection
A dataset usually contains:
- text → news content
- label → REAL or FAKE
Example:
| Text | Label |
|---|---|
| Government announces new policy | REAL |
| Celebrity cured cancer with lemon | FAKE |
(You can use Kaggle Fake News datasets or your own collected data.)
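If you collect your own data, the file only needs those two columns. The sketch below builds a CSV from the two example rows in the table and reloads it to confirm the layout. It writes to a temporary directory (an assumption made so the example does not touch your files); in the tutorial the file is named `news.csv`.

```python
import os
import tempfile

import pandas as pd

# The two example rows from the table above; real data would have many more.
rows = [
    {"text": "Government announces new policy", "label": "REAL"},
    {"text": "Celebrity cured cancer with lemon", "label": "FAKE"},
]
df = pd.DataFrame(rows, columns=["text", "label"])

# Write to a temporary directory here; the tutorial itself reads "news.csv".
csv_path = os.path.join(tempfile.mkdtemp(), "news.csv")
df.to_csv(csv_path, index=False)

# Reload and confirm the layout matches what the later steps expect.
reloaded = pd.read_csv(csv_path)
print(list(reloaded.columns))
print(reloaded["label"].value_counts().to_dict())
```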
Step 1: Import Required Libraries
First, install the required libraries:
python3 -m pip install scikit-learn pandas numpy
Then import them:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import PassiveAggressiveClassifier
from sklearn.metrics import accuracy_score
Step 2: Load the Dataset
data = pd.read_csv("news.csv")
print(data.head())
Our dataset should contain:
- text
- label
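Real-world CSV files are rarely this tidy, so a small cleaning pass after loading can help. The following is a sketch under assumptions: the inline CSV stands in for `news.csv`, its rows are made up, and the cleaning rules (drop missing values, uppercase labels, drop duplicate texts) are one reasonable choice rather than a required recipe.

```python
import io

import pandas as pd

# Inline CSV standing in for news.csv (contents are made up for illustration).
raw_csv = io.StringIO(
    "text,label\n"
    "Government announces new policy,REAL\n"
    "Celebrity cured cancer with lemon,fake\n"
    ",REAL\n"
    "Government announces new policy,REAL\n"
)
data = pd.read_csv(raw_csv)

data = (
    data.dropna(subset=["text", "label"])  # remove rows with missing values
    .assign(label=lambda d: d["label"].str.strip().str.upper())  # "fake" -> "FAKE"
    .drop_duplicates(subset="text")  # duplicates could leak between train and test
    .reset_index(drop=True)
)

print(data)
print(data["label"].value_counts().to_dict())
```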
Step 3: Split Data into Training & Testing
X = data['text']
y = data['label']
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
Why split?
- Train → learning
- Test → real-world checking
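When one label is much rarer than the other, a plain random split can leave the test set with too few examples of it. A possible refinement, sketched here on a hypothetical 80/20 dataset of placeholder strings, is the `stratify` argument of `train_test_split`, which keeps the class ratio the same in both splits.

```python
from collections import Counter

from sklearn.model_selection import train_test_split

# Hypothetical imbalanced dataset: 80 REAL and 20 FAKE placeholder articles.
texts = [f"article {i}" for i in range(100)]
labels = ["REAL"] * 80 + ["FAKE"] * 20

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, random_state=42, stratify=labels
)

# With stratify, both splits keep the original 80/20 class ratio.
print(Counter(y_train))
print(Counter(y_test))
```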
Step 4: Convert Text into Numbers (TF-IDF)
Computers don’t understand words — they understand numbers.
vectorizer = TfidfVectorizer(stop_words='english', max_df=0.7)
X_train_tfidf = vectorizer.fit_transform(X_train)
X_test_tfidf = vectorizer.transform(X_test)
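TF-IDF, with the settings above:
- Drops common English stop words (is, the, and) via stop_words='english'
- Drops terms that appear in more than 70% of the articles via max_df=0.7
- Gives higher weight to terms that are frequent in one article but rare across the dataset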
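To make the weighting concrete, here is a hand-rolled sketch of the calculation in plain Python. It assumes a made-up three-document corpus, and it uses the smoothed IDF form and L2 normalization that scikit-learn applies by default; it is for illustration only and is not a replacement for `TfidfVectorizer`.

```python
import math
from collections import Counter

# Hypothetical three-document corpus, already lowercased and tokenized.
docs = [
    "government announces new policy".split(),
    "government publishes new report".split(),
    "celebrity cured cancer with lemon".split(),
]

n_docs = len(docs)
# Document frequency: in how many documents does each term appear?
doc_freq = Counter(term for doc in docs for term in set(doc))

def idf(term):
    # Smoothed inverse document frequency: rarer terms get a larger value.
    return math.log((1 + n_docs) / (1 + doc_freq[term])) + 1

def tfidf(doc):
    # Raw term count times IDF, then scaled so the vector has unit (L2) length.
    counts = Counter(doc)
    weights = {term: count * idf(term) for term, count in counts.items()}
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {term: w / norm for term, w in weights.items()}

weights = tfidf(docs[0])
# "government" and "new" occur in two documents, so they weigh less than
# "announces" and "policy", which occur in only one.
print(weights)
```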
Step 5: Train the Fake News Model
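model = PassiveAggressiveClassifier(max_iter=50)
model.fit(X_train_tfidf, y_train)
Why PassiveAggressive?
- Fast: it updates its weights one example at a time
- Good for text data: it handles sparse, high-dimensional TF-IDF features well
- Suited to large or continuously growing text collections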
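The name comes from how the model updates itself: it stays passive when an example is already classified with enough margin, and it moves aggressively when it is not. The sketch below shows one such update in plain Python, assuming the PA-I variant with hinge loss and labels encoded as +1 and -1. The function name and feature vectors are hypothetical, and scikit-learn's implementation differs in detail (it also handles the intercept and sparse input).

```python
def pa1_update(weights, x, y, C=1.0):
    """One Passive-Aggressive (PA-I) step for a label y in {-1, +1}."""
    margin = y * sum(w * xi for w, xi in zip(weights, x))
    loss = max(0.0, 1.0 - margin)  # hinge loss
    squared_norm = sum(xi * xi for xi in x)
    if loss == 0.0 or squared_norm == 0.0:
        return weights  # passive: margin is already >= 1 (or x is all zeros)
    tau = min(C, loss / squared_norm)  # aggressive step size, capped at C
    return [w + tau * y * xi for w, xi in zip(weights, x)]

# Hypothetical 3-feature vectors; +1 stands for REAL and -1 for FAKE.
w = [0.0, 0.0, 0.0]
w = pa1_update(w, [1.0, 0.0, 1.0], +1)  # margin was 0, so the weights move
print(w)  # [0.5, 0.0, 0.5]

w_after = pa1_update(w, [1.0, 0.0, 1.0], +1)  # margin is now 1, so no change
print(w_after == w)  # True
```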
Step 6: Test the Model Accuracy
y_pred = model.predict(X_test_tfidf)
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
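Typical accuracy:
- 90%+ with a clean, well-labeled dataset; the exact figure depends on the data, so expect lower scores on noisier or out-of-domain news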
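Accuracy alone can hide problems when one label dominates. As a sketch with made-up labels (not results from any real dataset), the example below shows a model that always answers REAL: it scores 80% accuracy while catching no fake news at all, which the confusion matrix and the recall for FAKE make visible.

```python
from sklearn.metrics import accuracy_score, confusion_matrix, recall_score

# Hypothetical true labels and predictions for ten test articles.
y_true = ["REAL"] * 8 + ["FAKE"] * 2
y_hat = ["REAL"] * 10  # a useless model that always answers REAL

acc = accuracy_score(y_true, y_hat)
print("Accuracy:", acc)

# Rows are true labels, columns are predicted labels, in the order given.
cm = confusion_matrix(y_true, y_hat, labels=["FAKE", "REAL"])
print(cm)

# Share of the truly FAKE articles that the model actually flagged.
fake_recall = recall_score(y_true, y_hat, pos_label="FAKE")
print("Recall for FAKE:", fake_recall)
```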
Step 7: Detect Fake News from User Input
def detect_news(news_text):
    transformed = vectorizer.transform([news_text])
    prediction = model.predict(transformed)
    return prediction[0]
news = "Breaking: Scientists confirm water cures all diseases"
print(detect_news(news))
Output:
FAKE
Real-World Use Cases
Fake News Detector can be used in:
- News websites
- Browser extensions
- Social media filters
- WhatsApp message verification
- Journalism tools
This is not just a demo project, but a portfolio-ready idea.
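Tools like filters or verification bots often need more than a bare label, for example to send borderline cases to a human reviewer. One possible extension of the `detect_news` function from Step 7 is to also return the model's decision score. The sketch below assumes a linear model that offers `decision_function`. The helper name, the four training sentences, and the choice of `SGDClassifier` with hinge loss are illustrative assumptions made so the example runs on its own.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier

def detect_with_score(news_text, vectorizer, model):
    """Return (label, score); score is the signed distance from the boundary."""
    features = vectorizer.transform([news_text])
    score = float(model.decision_function(features)[0])
    label = model.predict(features)[0]
    return label, score

# Tiny made-up training set so the sketch is self-contained.
texts = [
    "Government announces new education policy",
    "Central bank publishes quarterly inflation report",
    "Celebrity cured cancer with lemon juice overnight",
    "Scientists confirm miracle water cures all diseases",
]
labels = ["REAL", "REAL", "FAKE", "FAKE"]

vectorizer = TfidfVectorizer(stop_words="english")
model = SGDClassifier(loss="hinge", random_state=42)
model.fit(vectorizer.fit_transform(texts), labels)

label, score = detect_with_score(
    "Scientists confirm water cures all diseases", vectorizer, model
)
# For two classes, a positive score means model.classes_[1] ("REAL" here),
# and a score close to zero means the model is unsure.
print(label, round(score, 3))
```

The score is a distance, not a probability, so any threshold for "unsure" has to be chosen by inspecting scores on held-out data.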
Source Code:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score
# 1. Load dataset
data = pd.read_csv("news.csv")
# Check required columns
if 'text' not in data.columns or 'label' not in data.columns:
    raise ValueError("Dataset must contain 'text' and 'label' columns")
# 2. Split data
X = data['text']
y = data['label']
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
# 3. Text vectorization (TF-IDF)
vectorizer = TfidfVectorizer(
    stop_words='english',
    max_df=0.7
)
X_train_tfidf = vectorizer.fit_transform(X_train)
X_test_tfidf = vectorizer.transform(X_test)
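# 4. Train model (Passive-Aggressive via SGDClassifier)
# Note: learning_rate='pa1' is only accepted by recent scikit-learn releases;
# on older versions, use PassiveAggressiveClassifier as in Step 5.
model = SGDClassifier(
    loss='hinge',  # same as Passive-Aggressive
    penalty=None,
    learning_rate='pa1',
    eta0=1.0,
    max_iter=1000,
    random_state=42
)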
model.fit(X_train_tfidf, y_train)
# 5. Evaluate model
y_pred = model.predict(X_test_tfidf)
accuracy = accuracy_score(y_test, y_pred)
print("✅ Model trained successfully")
print(f"🎯 Accuracy: {accuracy * 100:.2f}%")
# 6. Prediction function
def detect_fake_news(news_text):
    text_vector = vectorizer.transform([news_text])
    prediction = model.predict(text_vector)
    return prediction[0]
# 7. User input loop
print("\n📰 Fake News Detector (type 'exit' to stop)\n")
while True:
    user_input = input("Enter news text: ")
    if user_input.lower() == "exit":
        print("👋 Exiting Fake News Detector")
        break
    result = detect_fake_news(user_input)
    print("🔍 Prediction:", result)
    print("-" * 50)
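The script above retrains the model every time it starts. One way to avoid that, sketched here under assumptions, is to bundle the vectorizer and classifier into a single pipeline and save it with `joblib`. The file name, the temporary directory, and the four training sentences are hypothetical, and `SGDClassifier` with hinge loss stands in for the model trained above.

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

# Tiny made-up training set so the sketch is self-contained.
texts = [
    "Government announces new education policy",
    "Central bank publishes quarterly inflation report",
    "Celebrity cured cancer with lemon juice overnight",
    "Miracle pill makes you lose weight while sleeping",
]
labels = ["REAL", "REAL", "FAKE", "FAKE"]

# Keeping the vectorizer and model together ensures they are saved as a pair.
detector = make_pipeline(
    TfidfVectorizer(stop_words="english"),
    SGDClassifier(loss="hinge", random_state=42),
)
detector.fit(texts, labels)

# Save once after training (hypothetical file name in a temporary directory).
model_path = os.path.join(tempfile.mkdtemp(), "fake_news_detector.joblib")
joblib.dump(detector, model_path)

# Later, for example inside a web app, load it instead of retraining.
restored = joblib.load(model_path)
sample = ["Miracle lemon cure shocks doctors"]
same_answer = restored.predict(sample)[0] == detector.predict(sample)[0]
print(same_answer)
```

Files written by `joblib` are pickle-based, so only load model files that come from a source you trust.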
Frequently Asked Questions (FAQ)
Is Fake News Detection legal?
Yes, if used for educational and research purposes.
Can beginners build this?
Absolutely. Basic Python knowledge is enough.
Is this project good for resumes?
Yes, especially for:
- Data Science
- Python Developer
- Machine Learning roles
Final Thoughts
Fake news is a serious digital problem, but technology can fight it.
By building a Fake News Detector:
- You learn real Machine Learning
- You create meaningful software
If you are serious about Python and AI, this project deserves a place in your skill set and on your resume.