About Me
Data Engineer | Full-Stack Developer | Problem Solver
Hello! I'm Abhishek Kumar
I'm a Data Engineer and Full-Stack Developer based in Bangalore, passionate about building scalable data pipelines and real-time analytics systems that drive business value.
With expertise in modern data engineering tools and frameworks, I specialize in designing and implementing end-to-end data solutions—from ingestion and transformation to visualization and insights. I thrive at the intersection of data engineering and software development, creating robust systems that handle massive scale.
My work spans diverse domains, including AdTech, Travel & Hospitality, Financial Data Analytics, and Real-time Streaming Platforms. I'm particularly excited about building systems that process data at scale using technologies like Apache Spark, Databricks, Kafka, and modern data orchestration tools.
When I'm not architecting data pipelines or writing code, you'll find me exploring new technologies, contributing to open-source projects, or sharing my knowledge through technical writing.
What I Do
Data Engineering
Building scalable ETL/ELT pipelines, data warehouses, and streaming platforms
Real-Time Analytics
Streaming data processing with Kafka, Spark Streaming, and Delta Lake
Full-Stack Development
Web applications and APIs built with Django, FastAPI, and React
Data Architecture
Designing scalable data platforms and cloud infrastructure (AWS, Azure)
Technical Skills
Data Engineering
Programming Languages
Web Development
Cloud & DevOps
Experience Highlights
AdTech Analytics Pipeline
Built a near-real-time data pipeline that processes 100K URLs every 15 minutes using Prefect, PySpark, and Databricks. Implemented incremental processing with Delta Lake so that each run updates only the data that changed.
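The idea behind incremental processing can be illustrated with a minimal sketch. This is plain Python, not the Prefect, PySpark, or Delta Lake code used in the project, and every name in it (UrlRecord, select_new_records, the updated_at watermark) is hypothetical and chosen only for illustration. Each run picks up only the records newer than the last stored watermark, then advances the watermark.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class UrlRecord:
    """Hypothetical record: a URL plus the time it was last updated."""
    url: str
    updated_at: datetime


def select_new_records(
    records: Iterable[UrlRecord], watermark: datetime
) -> Tuple[List[UrlRecord], datetime]:
    """Return the records strictly newer than the watermark and the new watermark.

    If nothing is new, the watermark is returned unchanged, so a repeated
    run over the same input processes nothing.
    """
    new_records = [r for r in records if r.updated_at > watermark]
    new_watermark = max((r.updated_at for r in new_records), default=watermark)
    return new_records, new_watermark
```

Calling select_new_records a second time with the returned watermark yields an empty batch, which is the property that keeps repeated 15-minute runs cheap.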
Real-Time Streaming Platform
Architected and deployed a streaming analytics platform for crypto and stock data using Kafka, Spark Streaming, and WebSockets. It processes about 20 GB of market data daily with sub-second latency.
CDC Data Warehouse
Designed and implemented Change Data Capture pipelines for a travel booking platform, synchronizing data across multiple sources with Databricks and Snowflake. Reduced data latency from hours to minutes.
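As a rough illustration of what applying Change Data Capture events involves, here is a minimal sketch in plain Python. It is not the Databricks or Snowflake implementation from the project. The event shape (op, id, and row keys) and the apply_changes function are assumptions made up for this example. Change events are replayed in order against a target table keyed by primary key.

```python
from typing import Any, Dict, Iterable

# Hypothetical event shape: {"op": "insert" | "update" | "delete", "id": int, "row": dict}
ChangeEvent = Dict[str, Any]


def apply_changes(
    target: Dict[int, Dict[str, Any]], events: Iterable[ChangeEvent]
) -> Dict[int, Dict[str, Any]]:
    """Apply change events in order to a target table keyed by primary key.

    Inserts and updates are treated as upserts; deletes of a missing key
    are ignored, so replaying the same events is harmless.
    """
    for event in events:
        key = event["id"]
        op = event["op"]
        if op in ("insert", "update"):
            target[key] = event["row"]
        elif op == "delete":
            target.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op!r}")
    return target
```

Treating inserts and updates as the same upsert is a design choice that makes the replay idempotent, which matters when a source redelivers events after a failure.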
Let's Work Together
I'm always interested in hearing about new projects and opportunities.