Building a Smart Content Censorship System: A Step-by-Step Guide

Content Censorship System in Python

Censoring text in Python refers to the process of identifying and replacing offensive, inappropriate, or sensitive words or phrases in a text. This is often done to ensure that the content is suitable for various audiences, especially in applications like social media, forums, or chat platforms.

There are several ways to censor text in Python, including:

  1. Using pre-defined word lists.
  2. Using libraries that provide censorship functionality.
  3. Using machine learning models to detect offensive language.
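
As a minimal sketch of the first approach (a pre-defined word list), the function below uses only the standard library. The name censor_text, the mask_char parameter, and the sample words are illustrative assumptions, not part of any package discussed in this guide.

```python
import re


def censor_text(text, bad_words, mask_char="*"):
    """Replace each whole-word occurrence of a listed word with mask characters."""
    if not bad_words:
        return text
    # Escape each word so characters like "." or "+" are matched literally.
    alternatives = "|".join(re.escape(word) for word in bad_words)
    # \b keeps the match to whole words, so "bad" does not hit "badge".
    pattern = re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)
    # Mask with one character per matched letter to keep the text length unchanged.
    return pattern.sub(lambda m: mask_char * len(m.group()), text)


sample = "This is some bad and offensive text, but a badge is fine."
print(censor_text(sample, ["bad", "offensive"]))
# This is some *** and ********* text, but a badge is fine.
```

A plain word list is easy to audit and extend, but it only catches exact spellings, which is where the libraries below help.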

If you’re looking for a Python package that handles text censorship, you can use profanity-check or better_profanity. These packages are designed to detect and censor profane or offensive language.

1. Using better_profanity

Installation:
pip install better_profanity
Usage:

from better_profanity import profanity

# Optional: replace the built-in word list with a custom one
# (skip this call to keep the library's default list)
profanity.load_censor_words(["bad", "offensive"])

# Censor text
text = "This is some bad and offensive text."
censored_text = profanity.censor(text)

print("Original text:", text)
print("Censored text:", censored_text)
Output:

Original text: This is some bad and offensive text.
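Censored text: This is some **** and **** text.

Note that better_profanity replaces every matched word with four asterisks by default, regardless of the word's length.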

2. Using profanity-check

Installation:
pip install profanity-check
Usage:

from profanity_check import predict, predict_prob

text = "This is a bad text."
texts = ["This is fine.", "This is offensive."]

# Predict if a text is offensive (1 = offensive, 0 = not offensive)
print(predict([text]))

# Predict the probability of being offensive
print(predict_prob(texts))

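profanity-check is better suited to detecting profane or offensive language than to replacing it with censored text: predict returns a 0/1 label for each input string, and predict_prob returns the corresponding probability. If the package fails to import alongside a recent scikit-learn release, the fork alt-profanity-check is published as a drop-in replacement that keeps the same profanity_check import name.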

Which One to Use?

  • Use better_profanity if you want a simple and customizable censoring tool.
  • Use profanity-check if you need a machine-learning-based approach to detect offensive content.
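
The two tools can also be combined: score the text first, then censor it only when the score is high enough. The sketch below shows that flow under stated assumptions. The moderate function, the 0.5 threshold, and the two fake_* stand-ins are hypothetical and exist only so the example runs without third-party packages.

```python
from typing import Callable, Sequence


def moderate(
    text: str,
    predict_prob: Callable[[Sequence[str]], Sequence[float]],
    censor: Callable[[str], str],
    threshold: float = 0.5,
) -> dict:
    """Score the text, then censor it only if the score reaches the threshold."""
    probability = float(predict_prob([text])[0])
    flagged = probability >= threshold
    return {
        "text": censor(text) if flagged else text,
        "flagged": flagged,
        "probability": probability,
    }


# Hypothetical stand-ins for a real detector and a real censor function.
def fake_predict_prob(texts):
    return [0.9 if "bad" in t.lower() else 0.1 for t in texts]


def fake_censor(text):
    return text.replace("bad", "****")


result = moderate("This is a bad text.", fake_predict_prob, fake_censor)
print(result)
# {'text': 'This is a **** text.', 'flagged': True, 'probability': 0.9}
```

In a real application you would pass profanity_check.predict_prob and profanity.censor in place of the stand-ins, and tune the threshold against your own data.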

Here’s an example of a Flask API that combines both packages in a content censorship system: better_profanity censors the text, and profanity-check detects offensive language.

Install the required libraries

pip install flask better_profanity profanity-check

Flask API Code


from flask import Flask, request, jsonify
from better_profanity import profanity
from profanity_check import predict, predict_prob

app = Flask(__name__)

# Optional: replace better_profanity's built-in word list with a custom one
profanity.load_censor_words(["bad", "offensive"])

@app.route("/censor", methods=["POST"])
def censor_text():
    """
    API endpoint to censor offensive words in a given text.
    """
    # silent=True returns None instead of raising when the body is not valid JSON
    data = request.get_json(silent=True) or {}
    text = data.get("text", "")
    censored_text = profanity.censor(text)
    return jsonify({"original_text": text, "censored_text": censored_text})

@app.route("/detect", methods=["POST"])
def detect_offensive():
    """
    API endpoint to detect if text contains offensive content.
    """
    data = request.get_json(silent=True) or {}
    text = data.get("text", "")

    # predict returns 0/1 labels; predict_prob returns probabilities
    is_offensive = predict([text])[0]  # 1 for offensive, 0 for not offensive
    offensive_prob = predict_prob([text])[0]

    # Convert NumPy scalars to built-in types before serializing
    return jsonify({
        "text": text,
        "is_offensive": bool(is_offensive),
        "offensive_probability": float(offensive_prob)
    })

if __name__ == "__main__":
    app.run(debug=True)
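
The endpoints above accept whatever arrives in the request body. A small validation helper, sketched below, is one way to reject malformed input before it reaches the censoring or detection code. The name extract_text, the error messages, and the max_length limit are assumptions made for illustration.

```python
def extract_text(payload, max_length=10_000):
    """Return (text, error); exactly one of the two is None."""
    if not isinstance(payload, dict):
        return None, "Request body must be a JSON object."
    text = payload.get("text")
    if not isinstance(text, str):
        return None, 'Field "text" is required and must be a string.'
    if not text.strip():
        return None, 'Field "text" must not be empty.'
    if len(text) > max_length:
        return None, f'Field "text" must be at most {max_length} characters.'
    return text, None


print(extract_text({"text": "hello"}))
print(extract_text({"text": 42}))
```

Inside a Flask view you could call extract_text(request.get_json(silent=True)) and return jsonify({"error": error}) with status 400 whenever error is not None.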

Endpoints

Censor Offensive Words (better_profanity)

URL: /censor
Method: POST
Request Body (JSON):


{
  "text": "This is some bad and offensive text."
}

Response (JSON):


{
  "original_text": "This is some bad and offensive text.",
  "censored_text": "This is some **** and **** text."
}

Detect Offensive Text (profanity-check)

URL: /detect
Method: POST
Request Body (JSON):


{
  "text": "This is some offensive text."
}

Response (JSON, illustrative values):


{
  "text": "This is some offensive text.",
  "is_offensive": true,
  "offensive_probability": 0.85
}
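
To exercise these endpoints from Python without extra dependencies, a client can be sketched with the standard library alone. The helper names build_json_post and call_api are hypothetical, and the URL assumes the Flask development server is running locally on its default port 5000.

```python
import json
import urllib.request


def build_json_post(url, payload):
    """Build (but do not send) a POST request carrying a JSON body."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_api(url, payload, timeout=5):
    """Send the request and decode the JSON response."""
    request = build_json_post(url, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Assumed local address of the Flask development server.
request = build_json_post(
    "http://127.0.0.1:5000/censor", {"text": "This is some bad text."}
)
print(request.get_method(), request.full_url)
# POST http://127.0.0.1:5000/censor
```

With the server running, call_api("http://127.0.0.1:5000/detect", {"text": "some text"}) would return the decoded response dictionary.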

Content Censorship System in Node.js

In Node.js, censoring text involves identifying and replacing offensive or inappropriate words with a placeholder (e.g., *). There are several approaches to implement text censorship, including using word lists, regular expressions, and third-party libraries.

Here’s a breakdown of the methods for censoring text in Node.js:

1. Manual Censorship with Word Lists

One of the simplest methods is to maintain a list of offensive words and replace them with symbols like *. This can be done by using basic string replacement or regular expressions.


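function censorText(text, badWords) {
    badWords.forEach(word => {
        // Escape regex metacharacters so the word is matched literally
        const escaped = word.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
        // \b restricts the match to whole words; 'gi' = global, case-insensitive
        const regex = new RegExp(`\\b${escaped}\\b`, 'gi');
        text = text.replace(regex, '*'.repeat(word.length));
    });
    return text;
}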

const badWords = ['bad', 'offensive', 'ugly'];
const inputText = 'This is a bad and offensive sentence.';
const censoredText = censorText(inputText, badWords);

console.log(censoredText);

Output:

This is a *** and ********* sentence.

2. Using a Third-Party Library (bad-words)

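You can use a library like bad-words to censor offensive words in a text. It ships with a built-in list of common profanities and masks each character of a matched word with *. Mild words such as "bad" and "offensive" are not in that list, so the example below adds them with addWords before cleaning the text.

Installation:
npm install bad-words
Example:

// CommonJS import as used by bad-words 3.x; newer major versions may export Filter differently
const Filter = require('bad-words');
const filter = new Filter();

// Extend the default list with the words used in this demo
filter.addWords('bad', 'offensive');

const text = 'This is a bad example of offensive content.';
const censoredText = filter.clean(text);

console.log(censoredText);
Output:
This is a *** example of ********* content.

addWords accepts any number of words, so you can keep extending the filter with your own list:

filter.addWords('ugly');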

3. Using Regular Expressions

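For more flexibility, you can compile the whole word list into a single regular expression. The text is then scanned once instead of once per word, and the replacement callback masks each match according to its own length.


function censorWithRegex(text, badWords) {
    // Escape regex metacharacters, then join all words into one alternation
    const escaped = badWords.map(word => word.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'));
    const regex = new RegExp(`\\b(?:${escaped.join('|')})\\b`, 'gi');
    // Mask each match with one asterisk per character
    return text.replace(regex, match => '*'.repeat(match.length));
}

const badWords = ['bad', 'ugly'];
const text = 'This is a bad example of ugly content.';
const censoredText = censorWithRegex(text, badWords);

console.log(censoredText);
Output:
This is a *** example of **** content.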
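
Regular expressions can also catch simple obfuscations such as "b@d" or "0ffens1ve". The sketch below is written in Python, like the examples in the first half of this guide, and the same pattern-building idea carries over to JavaScript. The LOOKALIKES table and the function names are illustrative assumptions, and the table is far from exhaustive.

```python
import re

# Hypothetical substitution table: each letter maps to characters that may stand in for it.
LOOKALIKES = {"a": "a@4", "e": "e3", "i": "i1!", "o": "o0", "s": "s$5"}


def build_pattern(bad_words):
    """Compile one pattern that matches every word and its look-alike spellings."""
    def word_to_regex(word):
        parts = []
        for ch in word.lower():
            if ch in LOOKALIKES:
                # Character class listing the letter and its substitutes.
                parts.append("[" + re.escape(LOOKALIKES[ch]) + "]")
            else:
                parts.append(re.escape(ch))
        return "".join(parts)

    alternatives = "|".join(word_to_regex(w) for w in bad_words)
    # Lookarounds act as word boundaries that also work next to symbols like "@".
    return re.compile(rf"(?<!\w)(?:{alternatives})(?!\w)", re.IGNORECASE)


def censor_obfuscated(text, bad_words):
    """Mask each match with one asterisk per matched character."""
    if not bad_words:
        return text
    pattern = build_pattern(bad_words)
    return pattern.sub(lambda m: "*" * len(m.group()), text)


print(censor_obfuscated("That is b@d and ugly, but a badge is fine.", ["bad", "ugly"]))
# That is *** and ****, but a badge is fine.
```

Wider character classes catch more evasions but also raise the risk of false positives, so the substitution table is worth reviewing against real traffic.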

4. Using profanity-check in Node.js

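There is a Node.js package called profanity-check that uses machine learning to detect offensive content. You can use it to identify offensive text and respond accordingly. Check the package's README for the exact export names and return type before relying on the snippet below, since they may differ between versions.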

Installation:
npm install profanity-check
Example:

const { check } = require('profanity-check');

const text = 'This is a bad sentence.';
const isOffensive = check(text);

if (isOffensive) {
    console.log('This text is offensive');
} else {
    console.log('This text is not offensive');
}
Output:
This text is offensive

5. Express.js API for Censorship

You can build an API in Node.js using express to allow users to send text and get censored responses.

Install the necessary packages:
npm install express bad-words
Create the server:

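const express = require('express');
const Filter = require('bad-words');
const app = express();
const filter = new Filter();

// "bad" is not in the library's default list; add it so the sample response applies
filter.addWords('bad');

// Parse JSON request bodies into req.body
app.use(express.json());

app.post('/censor', (req, res) => {
    const { text } = req.body || {};
    // Reject requests that do not carry a string "text" field
    if (typeof text !== 'string') {
        return res.status(400).json({ error: 'Field "text" must be a string.' });
    }
    const censoredText = filter.clean(text);
    res.json({ originalText: text, censoredText });
});

app.listen(3000, () => {
    console.log('Server is running on port 3000');
});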
Test the API

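You can use Postman or curl to test the API. Send a POST request to http://localhost:3000/censor with the Content-Type header set to application/json and a JSON body containing the text field, for example {"text": "This is a bad sentence"}.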

Response:

{
  "originalText": "This is a bad sentence",
  "censoredText": "This is a *** sentence"
}
