Building a Smart Content Censorship System: A Step-by-Step Guide
Content Censorship System in Python
Censoring text in Python refers to the process of identifying and replacing offensive, inappropriate, or sensitive words or phrases in a text. This is often done to ensure that the content is suitable for various audiences, especially in applications like social media, forums, or chat platforms.
There are several ways to censor text in Python, including:
- Using pre-defined word lists.
- Using libraries that provide censorship functionality.
- Using machine learning models to detect offensive language.
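As a quick baseline, the first approach needs nothing beyond Python's standard library. Here is a minimal sketch (the helper name and word list are purely illustrative):
import re

def censor(text, banned_words):
    # Replace each banned word with asterisks of the same length
    for word in banned_words:
        pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
        text = pattern.sub(lambda m: "*" * len(m.group()), text)
    return text

print(censor("This is a bad sentence.", ["bad", "ugly"]))
Output:
This is a *** sentence.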
If you’re looking for a Python package that handles text censorship, you can use profanity-check or better_profanity. These packages are designed to detect and censor profane or offensive language.
1. Using better_profanity
Installation:
pip install better_profanity
Usage:
from better_profanity import profanity
# Load a custom list of words (optional)
profanity.load_censor_words(["bad", "offensive"])
# Censor text
text = "This is some bad and offensive text."
censored_text = profanity.censor(text)
print("Original text:", text)
print("Censored text:", censored_text)
Output:
Original text: This is some bad and offensive text.
Censored text: This is some **** and **** text.
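By default, better_profanity swaps each detected word for a fixed run of four censor characters, which is why the word lengths are not preserved above. You can also pass a different censor character as the second argument to censor():
from better_profanity import profanity

profanity.load_censor_words(["bad", "offensive"])

# Use '-' instead of the default '*'
print(profanity.censor("This is some bad and offensive text.", "-"))
Output:
This is some ---- and ---- text.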
2. Using profanity-check
Installation:
pip install profanity-check
Usage:
from profanity_check import predict, predict_prob
text = "This is a bad text."
texts = ["This is fine.", "This is offensive."]
# Predict if a text is offensive (1 = offensive, 0 = not offensive)
print(predict([text]))
# Predict the probability of being offensive
print(predict_prob(texts))
profanity-check is better suited to detecting profane or offensive language than to replacing it with censored text.
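Because predict_prob returns a score between 0 and 1, you can apply your own threshold instead of relying on the binary output of predict. A small sketch (the 0.7 cut-off is an arbitrary choice):
from profanity_check import predict_prob

def is_offensive(text, threshold=0.7):
    # Flag text whose offensive probability meets or exceeds the threshold
    return predict_prob([text])[0] >= threshold

print(is_offensive("This is offensive."))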
Which One to Use?
- Use better_profanity if you want a simple and customizable censoring tool.
- Use profanity-check if you need a machine-learning-based approach to detect offensive content.
Here’s an example of how to create a Flask API that uses both better_profanity for censoring text and profanity-check for detecting offensive language in a content censorship system.
Install the required libraries:
pip install flask better_profanity profanity-check
Flask API Code
from flask import Flask, request, jsonify
from better_profanity import profanity
from profanity_check import predict, predict_prob

app = Flask(__name__)

# Load custom censor words for better_profanity (optional)
profanity.load_censor_words(["bad", "offensive"])

@app.route("/censor", methods=["POST"])
def censor_text():
    """
    API endpoint to censor offensive words in a given text.
    """
    data = request.json
    text = data.get("text", "")
    censored_text = profanity.censor(text)
    return jsonify({"original_text": text, "censored_text": censored_text})

@app.route("/detect", methods=["POST"])
def detect_offensive():
    """
    API endpoint to detect if text contains offensive content.
    """
    data = request.json
    text = data.get("text", "")
    # Predict offensive probability
    is_offensive = predict([text])[0]  # 1 for offensive, 0 for not offensive
    offensive_prob = predict_prob([text])[0]
    return jsonify({
        "text": text,
        "is_offensive": bool(is_offensive),
        # Convert the NumPy float from predict_prob so jsonify can serialize it
        "offensive_probability": float(offensive_prob)
    })

if __name__ == "__main__":
    app.run(debug=True)
Endpoints
Censor Offensive Words (better_profanity)
URL: /censor
Method: POST
Request Body (JSON):
{
  "text": "This is some bad and offensive text."
}
Response (JSON):
{
  "original_text": "This is some bad and offensive text.",
  "censored_text": "This is some **** and **** text."
}
Detect Offensive Text (profanity-check)
URL: /detect
Method: POST
Request Body (JSON):
{
  "text": "This is some offensive text."
}
Response (JSON):
{
  "text": "This is some offensive text.",
  "is_offensive": true,
  "offensive_probability": 0.85
}
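With the Flask app running locally (Flask's development server defaults to port 5000), you can exercise both endpoints from the command line, for example:
curl -X POST http://localhost:5000/censor \
  -H "Content-Type: application/json" \
  -d '{"text": "This is some bad and offensive text."}'

curl -X POST http://localhost:5000/detect \
  -H "Content-Type: application/json" \
  -d '{"text": "This is some offensive text."}'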
Content Censorship System in Node.js
In Node.js, censoring text involves identifying and replacing offensive or inappropriate words with a placeholder (e.g., *). There are several approaches to implementing text censorship, including word lists, regular expressions, and third-party libraries.
Here’s a breakdown of the methods for censoring text in Node.js:
1. Manual Censorship with Word Lists
One of the simplest methods is to maintain a list of offensive words and replace them with symbols like *. This can be done with basic string replacement or regular expressions.
function censorText(text, badWords) {
  badWords.forEach(word => {
    const regex = new RegExp(`\\b${word}\\b`, 'gi');
    text = text.replace(regex, '*'.repeat(word.length));
  });
  return text;
}
const badWords = ['bad', 'offensive', 'ugly'];
const inputText = 'This is a bad and offensive sentence.';
const censoredText = censorText(inputText, badWords);
console.log(censoredText);
Output:
This is a *** and ********* sentence.
2. Using a Third-Party Library (bad-words)
You can use a library like bad-words to censor offensive words in a text. This library provides a list of common offensive words and allows you to easily censor them.
Installation:
npm install bad-words
Example:
const Filter = require('bad-words');

const filter = new Filter();

// The default dictionary targets actual profanity, so add the milder
// words used in this example to the filter's list
filter.addWords('bad', 'offensive');

const text = 'This is a bad example of offensive content.';
const censoredText = filter.clean(text);
console.log(censoredText);
Output:
This is a *** example of ********* content.
You can keep customizing the word list at any time with filter.addWords() and filter.removeWords().
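The Filter constructor also accepts a placeHolder option if you prefer a masking character other than *; here is a short sketch based on the bad-words README:
const Filter = require('bad-words');

// Use '-' instead of '*' as the masking character
const dashFilter = new Filter({ placeHolder: '-' });
dashFilter.addWords('bad');

console.log(dashFilter.clean('This is a bad example.'));
Output:
This is a --- example.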
3. Using Regular Expressions
For more flexibility, you can extend the regular-expression pattern already used above, for example to catch words even when letters are stretched out (e.g., "baaad").
function censorWithRegex(text, badWords) {
  badWords.forEach(word => {
    // Allow each letter to repeat, so "bad" also matches "baaad"
    const pattern = word.split('').map(ch => `${ch}+`).join('');
    const regex = new RegExp(`\\b${pattern}\\b`, 'gi');
    text = text.replace(regex, match => '*'.repeat(match.length));
  });
  return text;
}

const badWords = ['bad', 'ugly'];
const text = 'This is a bad example of ugly content.';
const censoredText = censorWithRegex(text, badWords);
console.log(censoredText);
Output:
This is a *** example of **** content.
4. Using profanity-check in Node.js
There is a Node.js package called profanity-check that can be used to detect offensive content, so you can flag offensive text and respond accordingly.
Installation:
npm install profanity-check
Example:
const { check } = require('profanity-check');
const text = 'This is a bad sentence.';
const isOffensive = check(text);
if (isOffensive) {
  console.log('This text is offensive');
} else {
  console.log('This text is not offensive');
}
Output:
This text is offensive
5. Express.js API for Censorship
You can build an API in Node.js using express to allow users to send text and get censored responses.
Install the necessary packages:
npm install express bad-words
Create the server:
const express = require('express');
const Filter = require('bad-words');

const app = express();
const filter = new Filter();

// Add words that are not in the default dictionary (used in the sample response below)
filter.addWords('bad');

app.use(express.json());

app.post('/censor', (req, res) => {
  const { text } = req.body;
  const censoredText = filter.clean(text);
  res.json({ originalText: text, censoredText });
});

app.listen(3000, () => {
  console.log('Server is running on port 3000');
});
Test the API
You can use Postman or curl to test the API by sending a POST request with a JSON body containing the text field.
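For example, with the server from the previous step running locally on port 3000:
curl -X POST http://localhost:3000/censor \
  -H "Content-Type: application/json" \
  -d '{"text": "This is a bad sentence"}'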
Response:
{
  "originalText": "This is a bad sentence",
  "censoredText": "This is a *** sentence"
}