Portfolio

Explore my collection of data engineering, web development, and machine learning projects showcasing real-world solutions.

Filtered Projects (3)

Data Warehousing

Travel Booking SCD2 Data Warehouse

Travel booking data warehouse built on Delta Lake, using Slowly Changing Dimension Type 2 (SCD2) to preserve history for analytics.

Delta Lake, PySpark, Python

DevOps & Cloud

Data Lake Architecture on AWS

Serverless data lake on AWS built with S3, Lambda, Glue, and Athena. Processes and stores about 20 GB of data daily, with automated partitioning and compression.

AWS, PySpark, Python, Terraform

Data Engineering

Real-time Data Pipeline with Kafka & Spark

High-throughput streaming data pipeline processing millions of events per day using Apache Kafka and PySpark. Implements real-time aggregations and anomaly detection.

Apache Kafka, Docker, PostgreSQL, PySpark, Python