A Flask-based web application that transforms AI-generated text into human-like content using advanced linguistic techniques. This tool employs multiple layers of obfuscation and transformation to evade AI detection algorithms while maintaining readability.
- Overview
- Features
- Tech Stack
- Installation
- Usage
- How It Works
- Project Structure
- API Reference
- Configuration
- Troubleshooting
- Ethical Considerations
- License
The AI Text Humanizer is designed to take AI-generated content and transform it into text that appears more naturally human-written. It uses four distinct transformation techniques working in tandem to modify text structure, vocabulary, and encoding without compromising comprehensibility.
Target Audience: Students, content creators, and researchers exploring text transformation and AI detection evasion techniques.
- **Synonym Spiking (Perplexity Attack)**
  - Replaces common words with less frequently used synonyms
  - Raises the unpredictability scores that AI detectors rely on
  - Uses the WordNet corpus for valid synonym selection
  - Configurable intensity (0.0 - 1.0)
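The intensity mechanic can be sketched offline. The real app queries NLTK's WordNet for synonyms; the tiny hardcoded pool below is purely illustrative, so only the replace-with-probability logic should be read as representative:

```python
import random

# Toy stand-in for WordNet lookups -- the real app calls
# nltk.corpus.wordnet; this pool is illustrative only.
RARE_SYNONYMS = {"use": "employ", "show": "evince", "big": "voluminous"}

def spike_perplexity(text, intensity=0.3, rng=random):
    """Swap a fraction of words (controlled by `intensity`) for rarer synonyms."""
    out = []
    for word in text.split():
        if word.lower() in RARE_SYNONYMS and rng.random() < intensity:
            out.append(RARE_SYNONYMS[word.lower()])
        else:
            out.append(word)
    return " ".join(out)

print(spike_perplexity("we use data to show results", intensity=1.0))
# → "we employ data to evince results"
```

At `intensity=1.0` every known word is replaced; at `0.0` the text passes through untouched, which is why the recommended range sits in between.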
- **Unicode Spoofing (Tokenization Attack)**
  - Mixes Unicode normalization forms (NFC vs NFD)
  - Characters appear identical to humans but differ at the binary level
  - Confuses character-level tokenization in detectors
  - Example: `é` (U+00E9) vs `e` + combining acute accent (U+0065 + U+0301)
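The NFC/NFD difference in that example can be demonstrated with nothing but the standard library:

```python
import unicodedata

nfc = "\u00e9"                            # 'é' as a single code point (NFC)
nfd = unicodedata.normalize("NFD", nfc)   # 'e' + combining acute (U+0301)

print(nfc == nfd)          # False — different code-point sequences
print(len(nfc), len(nfd))  # 1 2 — same glyph, different lengths
print(unicodedata.normalize("NFC", nfd) == nfc)  # True — round-trips cleanly
```

Both strings render identically on screen, but any byte- or code-point-level comparison sees two different inputs.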
- **Invisible Character Injection (Pattern Matching Attack)**
  - Injects zero-width spaces and word joiners between words
  - Characters: U+200B (Zero Width Space), U+2060 (Word Joiner), U+200C, U+200D
  - Survives most sanitization filters
  - Breaks token sequences without affecting readability
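A minimal sketch of the injection, using only U+200B (the app also mixes in U+2060, U+200C, and U+200D); the `prob` default mirrors the ~0.4 threshold noted in the Configuration section:

```python
import random

ZWSP = "\u200b"  # Zero Width Space — invisible when rendered

def inject_invisible_glue(text, prob=0.4, rng=random):
    """Insert a zero-width space inside words longer than 3 characters."""
    out = []
    for word in text.split():
        if len(word) > 3 and rng.random() < prob:
            mid = len(word) // 2
            word = word[:mid] + ZWSP + word[mid:]  # split the token invisibly
        out.append(word)
    return " ".join(out)

glued = inject_invisible_glue("hello hi", prob=1.0)
print(glued)          # looks exactly like "hello hi" on screen
print(ZWSP in glued)  # True — the extra code point is really there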
- **Grammatical Shattering (Syntax Attack)**
  - Routes text through linguistically distant languages
  - Chain: English (SVO) → Arabic (VSO) → Korean (SOV) → English
  - Different syntactic structures prevent reconstruction
  - Falls back to the original text if translation fails
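The chain-with-fallback control flow can be sketched with a pluggable `translate` callable. The injectable parameter is an assumption made here so the flow can run without network access; in app.py each hop is a `GoogleTranslator(source=..., target=...).translate(...)` call from deep_translator:

```python
def grammatical_shatter(text, translate):
    """Route text EN → AR → KO → EN, returning the original on any failure.

    `translate(text, src, tgt)` is injected for offline illustration;
    the real app wraps deep_translator's GoogleTranslator here.
    """
    try:
        v1 = translate(text, "en", "ar")  # SVO → VSO
        v2 = translate(v1, "ar", "ko")    # VSO → SOV
        return translate(v2, "ko", "en")  # SOV → back to SVO
    except Exception:
        return text  # fallback: hand back the untouched input

# Stub translator that just tags each hop, to make the chain visible:
tag = lambda t, s, d: f"{t}>{d}"
print(grammatical_shatter("hi", tag))  # → "hi>ar>ko>en"
```

The bare `except Exception` matches the README's promise that a timeout anywhere in the chain degrades gracefully to the original text rather than erroring out.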
- Simple Mode: Basic translation-back pipeline (faster, less aggressive)
- Deep Mode: All four weapons combined for maximum obfuscation
- Tone Control: Professional or standard variations
- Real-time metrics display
- Diff highlighting: Shows changed words
- Structural change percentage: Quantifies transformation intensity
- Word change count: Total words modified
- Longest unchanged sequence: Shows what stayed the same
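All four of these metrics can be derived from the standard library's `difflib`; the sketch below is one plausible way to compute them and is not necessarily identical to the app's `calculate_diff_metrics`:

```python
import difflib

def diff_metrics(original, humanized):
    """Word-level diff metrics: changed count, structural %, longest run."""
    o, h = original.split(), humanized.split()
    sm = difflib.SequenceMatcher(a=o, b=h)
    matched = sum(b.size for b in sm.get_matching_blocks())
    longest = max(sm.get_matching_blocks(), key=lambda b: b.size)
    return {
        "changed_words": len(o) - matched,
        "structural_changes": f"{100 * (1 - sm.ratio()):.1f}%",
        "longest_unchanged": " ".join(o[longest.a:longest.a + longest.size]),
    }

print(diff_metrics("the system was designed to process data",
                   "the mechanism was engineered to handle data"))
```

`SequenceMatcher.ratio()` measures similarity, so `1 - ratio()` gives the structural-change percentage shown in the UI.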
| Component | Technology |
|---|---|
| Backend Framework | Flask 2.x |
| Frontend | HTML5, CSS3, JavaScript (Vanilla) |
| NLP Library | NLTK (WordNet corpus) |
| Translation Service | Google Translator (deep_translator) |
| Server | Flask development server |
| Language | Python 3.8+ |
```
Flask>=2.0.0
deep_translator>=1.10.0
nltk>=3.6.0
```
- Python 3.8 or higher
- pip (Python package manager)
- Internet connection (for NLTK data and Google Translate)
1. **Clone or download the project**

   ```bash
   cd your-project-directory
   ```

2. **Create a virtual environment (recommended)**

   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```

3. **Install dependencies**

   ```bash
   pip install -r requirements.txt
   ```

   Or install manually:

   ```bash
   pip install Flask deep_translator nltk
   ```

4. **Run the application**

   ```bash
   python app.py
   ```

5. **Access the application**
   - Open your browser and navigate to `http://localhost:5000`
   - You should see the humanizer interface
- Paste your text into the input textarea
- Select transformation mode:
- Simple Mode: Quick, less aggressive transformation
- Deep Mode: Comprehensive obfuscation using all four techniques
- Choose tone variation:
- Professional: Targets academic/formal writing
- Standard: General-purpose transformation
- Click "Humanize" to process
- View results:
- Humanized text output
- AI detection probability
- Structural changes analysis
- Word-by-word diff highlighting
Input:

```
The artificial intelligence system was designed to process large datasets efficiently.
```

Output (Deep Mode):

```
The artificial intelligence mechanism appeared constructed to treat broad information repositories productively.
```

(Note: output varies due to random synonym selection and Unicode injection.)
Original: "The system was designed to process data"
Modified: "The mechanism was engineered to handle information"
Score Impact: Increases perplexity (unpredictability)
Binary Level: Characters are decomposed/recomposed using NFD/NFC forms
Visual Result: Identical on screen, different in tokenization
Impact: Breaks byte-level pattern matching
Visual: "word1 word2"
Actual: "word1\u200Bword2" (Zero-width space injected)
Impact: Disrupts token boundaries
Step 1: English → Arabic (Changes verb-subject order)
Step 2: Arabic → Korean (Introduces particle system)
Step 3: Korean → English (Reconstructs with new structure)
Impact: Syntax tree becomes unrecognizable to English-trained models
```
Input Text
    ↓
[Simple Mode] ────→ Translation (JP/FR) → Back-translate → Output
    ↓
[Deep Mode]
    ├→ Grammatical Shattering (Syntax layer)
    ├→ Synonym Spiking (Vocabulary layer)
    ├→ Unicode Spoofing (Character layer)
    ├→ Invisible Glue Injection (Token layer)
    └→ Output
    ↓
Metrics Calculation (Diff, Changes, Structural Score)
    ↓
JSON Response to Frontend
```
```
project-root/
├── app.py             # Main Flask application
├── index.html         # Frontend interface
├── requirements.txt   # Python dependencies (create this)
└── README.md          # This file
```
app.py:
- 8,623 characters
- Contains all transformation logic
- Routes: `/` (homepage), `/humanize` (API endpoint)
- Four main transformation functions
- NLTK initialization and data download

index.html:
- 24,191 characters
- Complete frontend UI
- Form handling and AJAX requests
- Results display and visualization
- Responsive design
URL: /humanize
Method: POST
Content-Type: application/json
```json
{
  "text": "Your text to humanize here",
  "tone": "professional",
  "deep_mode": true
}
```

| Parameter | Type | Required | Values | Description |
|---|---|---|---|---|
| `text` | string | Yes | Any | Input text to transform |
| `tone` | string | No | "professional", "standard" | Transformation style |
| `deep_mode` | boolean | No | true, false | Use all four techniques (true) or simple mode (false) |
```json
{
  "original": "Input text",
  "humanized": "Transformed text with invisible characters",
  "ai_probability": "4.2%",
  "confidence": "99.8%",
  "classification": "Human-Written",
  "diff_html": "HTML showing word changes",
  "longest_unchanged": "sequence of words",
  "structural_changes": "45.3%",
  "changed_words": 12
}
```

| Parameter | Type | Description |
|---|---|---|
| `original` | string | Original input text |
| `humanized` | string | Transformed text (includes invisible chars) |
| `ai_probability` | string | Simulated AI detection probability |
| `confidence` | string | Simulated detector confidence score |
| `classification` | string | Predicted classification |
| `diff_html` | string | HTML-formatted word changes |
| `longest_unchanged` | string | Longest sequence without changes |
| `structural_changes` | string | Percentage of structural modification |
| `changed_words` | integer | Number of modified words |
```bash
curl -X POST http://localhost:5000/humanize \
  -H "Content-Type: application/json" \
  -d '{
    "text": "The experiment was successful.",
    "tone": "professional",
    "deep_mode": true
  }'
```

Missing Text (400):

```json
{
  "error": "No text provided"
}
```

Processing Failed (500):

```json
{
  "error": "Processing failed. Text may be too long."
}
```

Intensity Control (Line ~110)

```python
intensity = 0.3  # Synonym spiking intensity (0.0 = none, 1.0 = all words)
```

- Lower values = fewer synonyms changed
- Higher values = more aggressive replacement
- Recommended: 0.3 - 0.5

Unicode Spoofing Probability (Line ~140)

```python
if random.random() < 0.5:  # 50% chance per character
```

- Controls how many characters are normalized differently
- Range: 0.0 - 1.0
- Higher = more characters affected

Invisible Character Injection Probability (Line ~160)

```python
if len(word) > 3 and random.random() < 0.4:  # 40% for words > 3 chars
```

- Controls injection frequency
- Only affects words longer than 3 characters
- Recommended: 0.3 - 0.5

Translation Chain Targets (Line ~190)

```python
v1 = GoogleTranslator(source='auto', target='ar').translate(text)
v2 = GoogleTranslator(source='ar', target='ko').translate(v1)
```

- Change language pairs for different results
- Recommended chains: AR→KO, ZH→JA, HI→KO
The app automatically downloads required data on first run:

```python
nltk.download('wordnet')
nltk.download('omw-1.4')
```

To pre-download manually:

```python
import nltk
nltk.download('wordnet')
nltk.download('omw-1.4')
```

Issue: "LookupError: Wordnet data not found"
- Solution: The app should auto-download. If not, run manually:

  ```python
  import nltk
  nltk.download('wordnet')
  nltk.download('omw-1.4')
  ```
Issue: Translation API fails (timeout/connection)
- Solution:
  - Check your internet connection
  - Google Translate may block rapid requests
  - Add a delay between requests: `time.sleep(0.5)`
  - Use a `try-except` fallback (already implemented)
Issue: "No module named 'flask'"
- Solution: Install dependencies: `pip install -r requirements.txt`

Issue: Port 5000 already in use
- Solution: Change the port in app.py: `app.run(debug=True, port=5001)`
Issue: Text processing takes too long
- Solution:
- Use Simple Mode instead of Deep Mode
- Reduce text length (try under 500 words)
- Deep Mode is slower due to multiple translation steps
Issue: Humanized text doesn't look different
- Solution:
  - Invisible characters are included but not visible
  - View the HTML source to see `\u200B` characters
  - Try Deep Mode for more visible changes
  - Increase the intensity parameter
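To confirm that the invisible characters actually survived (e.g. after a copy/paste), inspect the output at the code-point level rather than by eye:

```python
# Sample output string containing a zero-width space (U+200B):
text = "he\u200bllo world"

print("\u200b" in text)               # True — ZWSP is present
print(text.encode("unicode_escape"))  # escapes make the hidden code point visible
print([hex(ord(c)) for c in text if ord(c) > 0x7f])  # ['0x200b']
```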
| Mode | Text Length | Time |
|---|---|---|
| Simple | 100 words | 2-4 sec |
| Simple | 500 words | 4-8 sec |
| Deep | 100 words | 5-10 sec |
| Deep | 500 words | 15-30 sec |
Bottlenecks:
- Google Translate API calls (3 calls in deep mode)
- NLTK WordNet lookup (depends on word count)
- Unicode normalization (minimal impact)
Optimization Tips:
- Cache synonyms for repeated words
- Implement request queuing for multiple submissions
- Use async requests for translation
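The first tip is nearly free with `functools.lru_cache`. The lookup below is a stub standing in for the expensive WordNet query (presumably `get_rare_synonym` in app.py; that mapping is an assumption), so only the caching pattern is the point:

```python
from functools import lru_cache

calls = {"n": 0}

def wordnet_lookup(word):
    # Stand-in for the expensive WordNet/synonym query; the counter
    # just proves how often the underlying work actually runs.
    calls["n"] += 1
    return word.upper()

@lru_cache(maxsize=4096)
def cached_synonym(word):
    return wordnet_lookup(word)

cached_synonym("system")
cached_synonym("system")  # served from the cache, no second lookup
print(calls["n"])         # 1 — the underlying lookup ran only once
```

Repeated words across a document (and across requests, since the cache lives for the process lifetime) then cost a dictionary hit instead of a corpus query.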
This tool is designed for educational and research purposes. Users should be aware of the following:
-
Academic Integrity: Using this tool to submit AI-generated work as original human writing may violate academic honor codes. Institutions have specific policies against such practices.
-
Content Detection Evasion: While the technical approach is interesting, intentionally evading detection systems may be considered academic dishonesty.
-
Responsible Use: Consider using this tool to:
- Understand how AI detection systems work
- Learn about text transformation techniques
- Conduct academic research on AI detection
- Improve your own writing skills
-
Recommended Alternative Uses:
- Use as an educational project to learn Flask, NLP, and APIs
- Explore text transformation algorithms
- Research AI detection limitations
- Create writing enhancement tools that are transparent
-
Disclosure: If you use transformed text, disclose that it was AI-generated and transformed, even if it passes detection tools.
- Custom Dictionaries: User-defined word replacement pools
- Style Preservation: Maintain original tone/formality level
- Language Support: Direct transformation in multiple languages
- Batch Processing: Handle multiple texts simultaneously
- Metrics Dashboard: Track transformation effectiveness across samples
- Advanced NLP: Use spaCy for better POS tagging
- Caching Layer: Store synonym lookups for performance
- Web Worker Threads: Non-blocking processing in frontend
- Detection Score Feedback: Integration with actual AI detectors
- Multilingual Support: Support for non-English input text
Create a requirements.txt file in your project root:

```
Flask==2.3.0
deep_translator==1.11.4
nltk==3.8.1
```

Install with:

```bash
pip install -r requirements.txt
```

To contribute improvements:
- Test changes locally with various input texts
- Document any new transformation techniques
- Update performance benchmarks
- Add new language chains for grammatical shattering
- Improve error handling for edge cases
This project is provided as-is for educational purposes. Users are responsible for understanding and complying with applicable laws, regulations, and institutional policies.
The app.py file is organized into logical sections:
- Setup & Imports: Flask, NLTK initialization
- Secret Weapon 1: `get_rare_synonym()`, `spike_perplexity()`
- Secret Weapon 2: `unicode_spoofing()`
- Secret Weapon 3: `inject_invisible_glue()`
- Secret Weapon 4: `grammatical_shatter()`
- Main Logic: `calculate_diff_metrics()`, `perform_nuclear_chaos()`
- Routes: Flask route handlers
- `get_rare_synonym(word)` → Returns a less-common synonym
- `spike_perplexity(text, intensity)` → Synonym replacement
- `unicode_spoofing(text)` → NFD/NFC mixing
- `inject_invisible_glue(text)` → Zero-width character injection
- `grammatical_shatter(text)` → Multi-language transformation chain
- `calculate_diff_metrics(original, humanized)` → Analysis metrics
- `perform_nuclear_chaos(text, tone, deep_mode)` → Orchestration function
- `humanize()` → Flask POST endpoint
For issues or questions:
- Check the Troubleshooting section
- Review the How It Works section for understanding
- Test with shorter text lengths first
- Check API response error messages
- Verify all dependencies are installed correctly
Last Updated: December 2025
Version: 1.0
Status: Beta (Educational/Research)