Portfolio
Explore my collection of data engineering, web development, and machine learning projects showcasing real-world solutions.
Filtered Projects (3)
Travel Booking SCD2 Data Warehouse
Travel booking data warehouse on Delta Lake that preserves change history with Slowly Changing Dimension Type 2 (SCD2) tables for analytics.
Delta Lake
PySpark
Python
Data Lake Architecture on AWS
Serverless data lake implementation on AWS using S3, Lambda, Glue, and Athena. Processes and stores ~20 GB daily with automated partitioning and compression.
AWS
PySpark
Python
Terraform
Real-time Data Pipeline with Kafka & Spark
High-throughput streaming data pipeline processing millions of events per day using Apache Kafka and PySpark. Implements real-time aggregations and anomaly detection.
Apache Kafka
Docker
PostgreSQL
PySpark
Python