José Manuel Díaz Urraco jmdu99

Hi there 👋, I'm Jose

Freelance Data Engineer — Turning messy data into clarity for projects with real impact.
I work with purpose-driven teams to build data systems they can trust.

🛠 What I Do

📊 Centralise scattered data into a single source of truth
⚙️ Automate cleaning & validation for always-ready data
🚀 Design efficient ETL/ELT pipelines (Airflow, dbt, Spark…)
📈 Build solid foundations for BI, ML & GenAI
⏱ Create real-time dataflows when speed matters

💻 Tech Stack

Core Skills & Tooling

Python, SQL, Bash, Git, GitHub, Poetry, Pylint, Pandas, NumPy

Ingestion, Orchestration & Processing

Apache Airflow, Cloud Composer (GCP), MWAA (AWS), dbt, Fivetran, Airbyte, Prefect, Apache Spark, PySpark, Apache Beam, Dataflow (GCP), Dataproc (GCP), Spark Structured Streaming, Apache Kafka, Google Pub/Sub, Apache NiFi, Web scraping

Data Platforms & Storage

Amazon S3, Google Cloud Storage, Parquet, BigQuery, Snowflake, Amazon Redshift, Amazon Athena, PostgreSQL, MongoDB, Cassandra, ClickHouse

Cloud & DevOps

Amazon EC2, Google Compute Engine, Terraform (IaC), Docker, Docker Compose, GitHub Actions (CI/CD), IAM / RBAC

ML, NLP & Knowledge Graphs

Generative AI, Large Language Models, OpenAI API, LangChain (RAG), Hugging Face Transformers, NLTK, spaCy, scikit-learn, PyTorch, TensorFlow, SPARQL, AWS SageMaker

Analytics & Visualization

Matplotlib, Seaborn, Plotly, Amazon QuickSight, Apache Superset

🎯 About Me

Since 2021, I've worked in data across tech, banking, and large-scale systems (Amazon, Slido/Cisco).
In 2025, I went freelance to focus on projects with real impact — from healthtech and edtech to any sector that values purpose as much as results.
I also donate 10% of my earnings to the GiveWell Top Charities Fund.

🗂 Portfolio & Contact

💼 Portfolio request → LinkedIn
📩 Let’s connect and discuss how to make your data work better.
