Streaming and event-driven pipelines
Data Engineer
Building pipelines, streaming systems, and analytics backends for real downstream use.
I build developer-grade data systems across streaming, ETL, analytics APIs, and lakehouse exploration, with an emphasis on clean architecture and usable downstream outputs.
Streaming, ETL, APIs, lakehouse exploration
Practical systems thinking over decorative dashboards
Engineering depth with a developer-facing presentation
About
A data engineer focused on the systems behind reliable products.
I focus on reliable pipelines, practical system design, and turning raw data into dependable interfaces for analytics and product teams.
I am a data engineer based in Bengaluru, India, currently at Moody's Ratings. I enjoy building reliable flows from source to storage to consumption, especially when the architecture is as interesting as the output.
I would rather show shipped pipelines, real project depth, and the technical direction I am growing into than fill the page with generic portfolio noise.
Spark, PySpark, Airflow, and SQL workflows
Warehouse and lakehouse-oriented thinking
Backend-minded data products for downstream consumption
Experience
Data engineering experience across banking and ratings.
4+ years of total experience, progressing from business analysis into data engineering.
Data Engineer
Moody's Ratings · 2025 - Present
Moved into building data systems, improving pipeline reliability, and delivering analytics-facing engineering work with a stronger platform and developer mindset.
Business Analyst
Axis Bank · 2022 - 2025
Started on the analytics side of the stack, working with reporting, data quality, stakeholder requirements, and problem framing that later translated well into data engineering.
Projects
Selected builds across streaming, orchestration, APIs, analytics, and lakehouse exploration.
The Iceberg proof of concept is explicitly shown as work in progress.
TransitFlow Realtime Event Stream
A real-time transit-data pipeline that simulates events, streams through Kafka, processes with Spark, and lands analytics-ready datasets in AWS.
- Combines Kafka, Spark, Glue, Athena, and Redshift in one architecture
- Frames data engineering as an end-to-end flow from ingestion to query
- Strongest systems-style project for the portfolio hero section
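The first stage of that flow, simulating events before they reach Kafka, can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the field names (`route_id`, `stop_id`, `delay_seconds`), the route and stop identifiers, and the helper names `simulate_events` and `encode` are all assumptions made for the example. It uses only the Python standard library and stops at producing JSON bytes, which is the kind of payload a Kafka producer would typically send.

```python
import json
import random
from datetime import datetime, timedelta, timezone

# Hypothetical route and stop identifiers, used only for illustration.
ROUTES = ["R1", "R2", "R3"]
STOPS = ["S10", "S20", "S30", "S40"]


def simulate_events(n, seed=0, start=None):
    """Yield n synthetic transit events as dicts, one per simulated second."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    start = start or datetime(2025, 1, 1, tzinfo=timezone.utc)
    for i in range(n):
        yield {
            "event_id": i,
            "route_id": rng.choice(ROUTES),
            "stop_id": rng.choice(STOPS),
            "delay_seconds": rng.randint(0, 600),
            "event_time": (start + timedelta(seconds=i)).isoformat(),
        }


def encode(event):
    """Serialize an event to UTF-8 JSON bytes (an assumed wire format)."""
    return json.dumps(event, sort_keys=True).encode("utf-8")


if __name__ == "__main__":
    for event in simulate_events(3):
        print(encode(event))
```

Seeding the generator keeps the simulated stream deterministic, which makes downstream Spark jobs easier to test against known input.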
Reddit Sentiment ETL Pipeline
An orchestrated ETL workflow that extracts Reddit data, transforms it with PySpark sentiment analysis, and loads it for analytics and dashboarding.
- Uses Airflow for orchestration and PySpark for transformation
- Separates raw and transformed storage across MySQL and PostgreSQL
- Connects engineering work to downstream BI consumption
Analytics API
A lightweight analytics-serving layer that exposes data stored in a time-series-oriented backend for downstream use.
- Shows an API mindset beyond pipeline construction
- Useful to signal consumption patterns, not just storage
- Good bridge between engineering and analytics enablement
Realtime Voting System
A streaming aggregation project that processes live voting events and surfaces real-time counts through a dashboard-oriented flow.
- Demonstrates real-time processing patterns with PySpark streaming
- Adds another event-driven project to support the portfolio narrative
- Pairs well with TransitFlow as a second streaming proof point
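The core idea behind the real-time counts, grouping events into fixed time windows and counting per key, can be illustrated in plain Python. This is a simplified stand-in rather than the project's PySpark streaming code: the `(timestamp_seconds, candidate)` event shape, the 60-second window, and the function name `count_votes_by_window` are assumptions chosen for the example.

```python
from collections import Counter, defaultdict


def count_votes_by_window(events, window_seconds=60):
    """Count votes per candidate within tumbling (non-overlapping) windows.

    events: iterable of (timestamp_seconds, candidate) pairs.
    Returns a dict mapping window start time to per-candidate counts.
    """
    windows = defaultdict(Counter)
    for ts, candidate in events:
        # Align each event to the start of the window that contains it.
        window_start = (ts // window_seconds) * window_seconds
        windows[window_start][candidate] += 1
    return {start: dict(counts) for start, counts in sorted(windows.items())}


if __name__ == "__main__":
    sample = [(5, "A"), (20, "B"), (59, "A"), (61, "B"), (119, "B")]
    print(count_votes_by_window(sample))
```

A streaming engine applies the same grouping incrementally as events arrive, and additionally has to handle late or out-of-order events, which this batch-style sketch ignores.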
WhatsApp Chat Analyzer
A deployed analytics app that transforms exported chats into interactive insights, giving the portfolio a user-facing project alongside systems work.
- Shows that analysis can be packaged into a usable interface
- Adds visible demo value alongside backend-heavy projects
- Useful as a lighter, more approachable project in the lineup
Iceberg Lakehouse POC
An exploratory lakehouse proof of concept focused on Apache Iceberg and modern table-format thinking for scalable analytical data systems.
- Explicitly positioned as ongoing work rather than a finished build
- Signals active learning in lakehouse architecture and open table formats
- Presented as experimental and forward-looking
Stack
The tools behind ingestion, processing, orchestration, storage, and serving.
Ingestion
Processing
Orchestration
Storage
Analytics
Contact
Open to interesting data engineering work and technical collaboration.
GitHub is the best place to inspect the actual project depth. Use the repo links above, then reach out if there is a fit.