Naman Jain

Mentor
Rising Codementor
US$35.00
For every 15 mins
ABOUT ME

Naman is a Data & Cloud Solutions Architect with 8+ years of experience delivering data engineering services to multiple Fortune 100 clients, both on-site and remotely. He has taken dozens of clients' Scala Spark applications to production, in many instances delivering a step change in the efficiency of their use cases. He fundamentally believes in over-communication and establishing trust.

Eastern Time (US & Canada) (-04:00)
Joined April 2020
EXPERTISE
7 years experience
7 years experience
8 years experience
5 years experience
5 years experience
2 years experience
6 years experience

REVIEWS FROM CLIENTS

Naman's profile has been carefully vetted and approved as a Codementor. Connect with Naman now, and leave a review for them once you're done!
SOCIAL PRESENCE
GitHub
Mortgage-Market-Tri-Analysis
Imbalanced Class Classifier, Constrained Optimization, and Time Series Forecasting
Jupyter Notebook
4
3
BTC-Play-App
A Scala Play app that filters and visualizes BTC data fetched dynamically via a Coinbase API
Scala
0
0
EMPLOYMENTS
Data Solutions Architect
Private Enterprise Client
2020-01-01 to 2021-04-01
- Worked on orchestration and automation of workflows via Azure Data Factory.
- Optimized and partitioned storage in Azure Data Lake Storage (ADLS) Gen2.
- Implemented complex, strongly typed Scala Spark workloads in Azure Databricks, along with dependency management and Git integration.
- Implemented real-time, low-cost, low-latency streaming workflows that at their peak processed more than 2 million raw JSON blobs per second. Architected as Azure Blob Storage -> Azure Event Hubs -> Azure Queues via ABS-AQS.
- Created a multi-layered ELT platform consisting of raw/bronze (Azure Blob Storage), current/silver (Azure Delta Lake), and mapped/gold (Azure Delta Lake) layers.
- Balanced compute costs by spinning up clusters on demand versus persisting them.
- Made big data available for efficient, real-time analysis throughout the client organization via Delta tables, which provided indexed and optimized stores, ACID transaction guarantees, and table-level and row-level access controls.
- Tied all of this together in end-to-end workflows that were either refreshed with just a few clicks or automated as jobs.
- Led a team of five, comprising four developers and one solutions architect, to productionize big data workflows in Azure that enabled the client to sunset its legacy applications and run far more reliable and scalable production workflows.
- Enabled a wide diversity of use cases and future-proofed them by relying on open source and open standards.
Scala
Azure
ETL
Data Migration
Data Engineering
Databricks
Azure Data Lake
Lead Data Engineer
Stealth mode AI startup (Series A $20 Million)
2019-05-01 to 2020-01-01
- Architected and implemented a distributed machine learning platform.
- Productionized 20+ machine learning models via Spark MLlib.
- Built products and tools to reduce time to market (TTM) for machine learning projects, cutting the startup's TTM from the design phase to production by 50%.
- Productionized eight Scala Spark applications to transform the ETL layer that feeds the downstream machine learning models.
- Used Spark SQL for ETL, and Spark Structured Streaming and Spark MLlib for analytics.
- Led a team of six, comprising three data scientists, two back-end engineers, and one front-end engineer. Delivered a solution whose back-end layer communicated with the front end via a REST API and launched and managed Spark jobs on demand.
Scala
Linux
Bash
Apache Spark
MLlib
Data Engineering
Senior Data Engineer
Dow Chemical (Fortune 62)
2018-02-01 to 2019-05-01
- Productionized five Scala Spark apps for ETL. Wrote multiple Bash scripts to automate these jobs.
- Architected and productionized a Scala Spark app for validating Oracle source tables against their ingested counterparts in HDFS. The user could dynamically choose either a high-level validation or a data-level validation. When a discrepancy was found, the app output the exact columns and rows that mismatched between source and destination.
- Cut engineers' manual debugging workload by over 99%, down to running the app and reading its human-readable output file.
- Delivered the entire ETL and validation project ahead of schedule.
SQL
Scala
Bash
ETL
Apache Spark
Impala
Data Engineering
Apache Hive