Skip to content

A Python-based framework demonstrating data anonymization methods for securing user location data in modern Location-Based Services.

Notifications You must be signed in to change notification settings

AnanyaGubba/GeoShield

Repository files navigation

Improving Privacy in Location-Based Services through Data Anonymization Techniques

Overview

This repository contains a comprehensive implementation of privacy-preserving techniques for Location-Based Services (LBS), including:

  • k-Anonymity: Ensures location indistinguishability among k users
  • Spatial Cloaking: Reduces location precision through grid-based obfuscation
  • Geo-Indistinguishability: Applies differential privacy with Laplace noise

Quick Start

Prerequisites

Python 3.8 or higher pip (Python package manager)

Requirements.txt

numpy>=1.21.0
pandas>=1.3.0
matplotlib>=3.4.0
seaborn>=0.11.0
scipy>=1.7.0
flask>=2.0.0

Installation

  1. Clone or download the repository

  2. Install required packages:

    pip install -r requirements.txt

Project Structure

privacy-lbs/
│
├── privacy_lbs_main.py              # Main implementation with algorithms
├── lbs_server_client.py             # Server-client simulation
├── advanced_analysis.py             # Attack simulation & analysis
├── requirements.txt                 # Python dependencies
├── README.md                        # This file
│
└── outputs/                         # Generated outputs
    ├── location_dataset.csv
    ├── analysis_results             # all the results obtained are collected in this folder

Usage

1. Run Main Implementation

This generates synthetic data, applies all three anonymization techniques, and produces evaluation metrics:

python privacy_lbs_main.py

Outputs:

  1. location_dataset.csv - Synthetic location data (500K+ queries)
  2. privacy_metrics_comparison.png - Privacy metrics visualization
  3. service_quality_metrics.png - Service quality comparison
  4. location_distributions.png - Visual comparison of original vs anonymized

2. Run Server-Client Simulation

Simulates the three-tier architecture (Client → Anonymizer → LBS Provider):

python lbs_server_client.py

Outputs:

  • Console output showing real-time processing

3. Run Advanced Analysis

Performs attack simulations and comparative analysis:

python advanced_analysis.py

Outputs:

  1. technique_comparison.csv - Comparative metrics
  2. technique_radar_comparison.png - Radar chart comparison
  3. privacy_utility_tradeoff.png - Tradeoff analysis
  4. performance_dashboard.png - Comprehensive dashboard
  5. Heatmaps for location density

Understanding the Outputs

1. Privacy Metrics

  • Re-identification Risk: Probability of linking anonymized data to individuals (lower is better)
  • Information Loss: Average distance between original and anonymized locations (meters)
  • Anonymity Level: Size of anonymity set or privacy budget value

2. Service Quality Metrics

  • Precision: Percentage of accurate service responses
  • Response Time: End-to-end latency in milliseconds
  • Result Count: Average number of POIs returned per query

3. Visualizations

  • Location Distributions: Shows spatial impact of anonymization
  • Privacy Metrics Comparison: Bar charts comparing techniques
  • Service Quality: Performance metrics visualization
  • Radar Chart: Multi-dimensional technique comparison
  • Tradeoff Analysis: Privacy vs. utility scatter plot
  • Heatmaps: Query density before/after anonymization
  • Performance Dashboard: Comprehensive 9-panel overview

Privacy Attack Simulations

The advanced_analysis.py script simulates three types of attacks:

  1. Linkage Attack: Attempts to match anonymized locations to originals
  2. Trajectory Tracking: Tries to reconstruct user movement patterns
  3. Inference Attack: Attempts to infer sensitive attributes from locations

Best Practices

For Developers

  1. Use Hybrid Approach: Combine multiple techniques for defense-in-depth
  2. Adaptive Privacy: Adjust parameters based on user density and context
  3. User Control: Let users choose their privacy level
  4. Regular Audits: Monitor and update algorithms against new attacks

For Researchers

  1. Benchmark Against Baseline: Always compare with non-anonymized performance
  2. Multiple Metrics: Evaluate both privacy and utility
  3. Real-World Scenarios: Test with diverse geographic distributions
  4. Attack Simulations: Validate against known privacy attacks

Academic Context

This implementation accompanies the case study: "Privacy in Location-Based Systems: A Comprehensive Case Study" Key Findings:

  • 27-fold reduction in re-identification risk (87% → 3.2%)
  • Maintained service quality within acceptable thresholds (<7% precision loss)
  • Demonstrated feasibility of privacy-preserving LBS at scale
  • Validated trade-offs between privacy protection and service utility

Recommendations:

  • Implement privacy-by-design principles
  • Develop industry-wide standardization
  • Invest in user education
  • Regular privacy audits and updates
  • Collaboration with policymakers

About

A Python-based framework demonstrating data anonymization methods for securing user location data in modern Location-Based Services.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages