Madhav Kumar | Portfolio

Passionate Data Professional

With 10 years of experience, I specialize in designing, developing, and running large-scale data pipelines, data lakes, and scalable ingestion systems on the Azure Cloud Platform.

Professional Summary

Building Scalable Data Ecosystems

My approach blends a deep understanding of Big Data technologies with modern cloud architecture. Whether it's Snowflake optimization, Apache Spark processing, or automated CI/CD data pipelines, I bridge the gap between raw data and actionable business intelligence.

I am a results-driven Data Engineer with 10 years of experience, including 5 years focused on designing and implementing scalable data ingestion pipelines using Azure Data Factory. I have delivered data lake requirements at numerous large companies using the Big Data technology stack (Python, Spark, Hadoop, Hive).

I am proficient in leveraging Azure Databricks and Spark for distributed processing, and adept at designing cloud-based data warehouse solutions using Snowflake on Azure. I work collaboratively with stakeholders to implement logical and physical data models, ensuring performance, scalability, and data integrity.

Snowflake Expert

Deep expertise in multi-cluster warehouses, Time Travel, cloning, and performance optimization.

Big Data & Spark

Strong track record optimizing Spark jobs and distributed processing pipelines.

Streaming & Real-time

Real-time data architecture using Kafka and Spark Streaming.

CI/CD DevOps

Automating robust data pipeline deployments in Azure DevOps.

Work Experience

Azure Snowflake Data Engineer @ Walmart

Aug 2022 – Present

Designed and implemented scalable data ingestion pipelines using Azure Data Factory, ingesting data from SQL, CSV, and REST APIs.
Developed data processing workflows using Azure Databricks, leveraging Spark for distributed transformation tasks.
Designed a cloud-based data warehouse solution using Snowflake, creating schemas, tables, and views for efficient data retrieval.
Implemented partitioning, indexing, and caching strategies in Snowflake to reduce query latency.
Implemented real-time data processing solutions using Kafka and Spark Streaming for high-volume streaming data.
Developed an automated CI/CD framework for data pipelines using Jenkins and Azure DevOps.

Azure Databricks ADF Snowflake Kafka Spark SQL

Azure Snowflake Data Engineer @ State Farm

Oct 2020 – Jul 2022 | Dallas, TX

Implemented end-to-end data pipelines using Azure Data Factory to extract, transform, and load (ETL) data into Snowflake.
Used Azure Data Lake Storage to store raw data, with robust partitioning and retention strategies.
Integrated Azure Logic Apps for orchestrating complex workflows and triggers.
Implemented advanced analytics/ML workflows using Azure Machine Learning and Snowflake.
Designed data archiving and retention strategies using Azure Blob Storage and Snowflake's Time Travel feature.
Optimized Spark configurations, caching, and data partitioning in Azure Databricks.

Azure Logic Apps Snowflake Time Travel PySpark Azure Purview

Big Data Developer @ Aetna Inc.

Jul 2019 – Sep 2020 | Hartford, CT

Maintained data pipelines using Sqoop, Flume, and Kafka to ingest and process customer behavioral data.
Performed data aggregation on large-scale datasets using Apache Spark, Scala, and Hive.
Integrated HBase with Hive on the Analytics Zone, optimizing tables for efficient queries.
Migrated data from RDBMS (Oracle) to Hadoop using Sqoop for processing.
Automated deployments using YAML scripts for faster releases.

Hadoop Scala Sqoop HBase Kafka

Big Data Developer @ Anthem

Apr 2018 – Jun 2019 | Chicago, IL

Prepared an ETL framework using Sqoop, Pig, and Hive to bring data from various sources.
Developed Spark Streaming applications for real-time sales analytics.
Utilized Spark-Cassandra Connector APIs for data migration and reporting.
Worked extensively on combiners, partitioning, and the distributed cache to improve MapReduce job performance.

MapReduce Hive Cassandra Spark Streaming

Data Warehouse Developer @ Mayo Clinic

May 2015 – Mar 2018 | Rochester, MN

Designed ETL data flows using SSIS to extract and migrate data from SQL Server, Access, and Excel.
Applied dimensional data modeling to Data Mart design, developing fact and dimension tables with slowly changing dimensions (SCDs).
Built cubes and dimensions with different architectures in SSAS for business intelligence, using MDX scripting.
Developed parameterized, chart, dashboard, and scorecard reports in SSRS.

MS SQL Server SSIS SSAS SSRS

Data Warehouse Developer @ Allstate Insurance

Nov 2013 – Apr 2015 | Chicago, IL

Used the data warehouse to develop a Data Mart feeding downstream reports in Power BI.
Deployed SSIS Packages and created SQL Agent jobs for efficient package execution.
Developed stored procedures and triggers to facilitate consistent data entry.
Shared data with external parties using Snowflake data sharing, avoiding the need for complex pipelines.

Power BI C# SQL Profiler SharePoint

Technical Skills

Azure & Cloud Services

Azure Data Factory

Azure Databricks

Snowflake

Logic Apps

Function App

Azure DevOps

Azure Synapse

Big Data Technologies

MapReduce

Hive & Pig

PySpark & SparkSQL

Kafka

Spark Streaming

Oozie & Sqoop

Hortonworks / Cloudera

Languages & Databases

Python

Scala

SQL / PL-SQL

MS SQL Server

Oracle 11g/12c

Cosmos DB

ETL & Architecture

ETL Pipelines

SSIS / SSAS / SSRS

Data Warehousing

Dimensional Modeling

Data Marts

Change Data Capture

Academic Background

Master of Science - MS, Computer Science

Oregon State University

Sep 2023 - Mar 2025 | CGPA: 3.86

Skills: Algorithms, Machine Learning, Database Management (DBMS), Data Science

Bachelor of Technology - BTech, Computer Science

Amrita Vishwa Vidyapeetham

Jul 2017 - Jun 2021 | CGPA: 7.68

Skills: Data Structures, Operating Systems, Algorithms, Big Data Analytics

High School Diploma (XI - XII)

Sasi Junior College, Velivennu

Jun 2015 - Jul 2017

High School Diploma (I - X)

St. Ann's E.M School, Rajahmundry

Jun 2002 - May 2015

Licenses & Certifications

AWS Certified Cloud Practitioner

Introduction to Generative AI

JavaScript

SQL

Build Responsive Real-World Websites with HTML and CSS

HTML5 Application Development Fundamentals

Python Object Oriented Programming

Java Database Connection: JDBC & MySQL

Research Investigators and Key Personnel

Responsible Conduct of Research
