Basant

Rising Codementor

US$15.00

For every 15 mins

ABOUT ME

Senior Big data and AI Engineer

Senior data & AI engineer with 9+ years building and operating scalable, low-latency data platforms processing 15TB+ and hundreds of millions of events daily across AWS, GCP, and Azure. Open-source-first — Kafka, Spark, Apache Iceberg, Trino, dbt, and Airflow on Kubernetes — across ingestion, transformation, and BI/reverse-ETL consumption, with the governance, observability, and reliability (SLOs, 99.9% uptime) production data products demand. Also delivers production AI/agentic systems (LangChain, CrewAI, Google-ADK, RAG) and administers the databases behind them — PostgreSQL, MongoDB, Cassandra/ScyllaDB, Redis, MySQL, ClickHouse (HA, backup/DR, tuning, security). Track record: 40% lower cost, 50% faster processing, 60% faster analytics deliver

English

Eastern Time (US & Canada) (-04:00)

Joined May 2026

EXPERTISE

9 years experience

9 years experience

7 years experience

7 years experience

RAG

3 years experience

Agentic frameworks

3 years experience

Kubernetes

7 years experience

REVIEWS FROM CLIENTS

Basant's profile has been vetted and approved as a Codementor.

EMPLOYMENTS

Senior Data Engineer

UXCam

2024-02-01 to Present

Owned the App Analytics Agent Platform end-to-end: architecture on Google-ADK, LangChain, and CrewAI with MCP servers for agent-to-agent communication and LangGraph orchestration. Platform runs autonomously across 500+ mobile apps, cutting analytics delivery time by 60% and analyst manual effort by 75%

Addressed a video analysis bottleneck by building RAG pipelines on Milvus and integrating LLM-based processing for 1M+ videos/month, resulting in 20% higher engagement insights. Separately deployed ReAct agents for funnel analysis and churn prediction

Established production observability for all LLM agents through LangSmith, Langfuse, and Opik, enabling full trace visibility into agent decisions and prompt-level regression detection

Manage Databricks + Unity Catalog infrastructure handling 10TB+ daily, including the data governance layer, lineage tracking, and schema evolution

Drove MLOps adoption with MLflow; the resulting recommendation system increased personalization by 25%. Also maintain serverless microservices on Lambda processing 1M+ events/second

Collaborate with product managers, data scientists, and business stakeholders to define data strategy, prioritize pipeline work, and ensure data security and access control standards are met

Mentor 5 engineers through structured code reviews and cross-team pairing sessions between data engineering and ML. Team delivery times decreased 30%

Lambda

ReAct

Databricks

MLflow

LangChain

LLM

CrewAI

LangGraph

RAG

MCP

LangSmith

Milvus vector DB

Opik

Langfuse

Data Engineer

UXCam

2020-02-01 to 2024-02-01

Rebuilt the core ETL layer in Spark and PySpark from the ground up. Processing time for 5TB+/day of mobile analytics data dropped 50%, which unblocked the product team on real-time dashboards

Tuned distributed databases across Trino, DuckDB, CitusData, TimescaleDB, and ClickHouse serving 100M+ queries/day. Query optimization and repartitioning delivered a 50% speed improvement

Led Kubernetes migration of the full data infrastructure. End result: 40% cost reduction, 99.9% uptime, and auto-scaling that held under peak load

Designed the Iceberg-based Lakehouse with Unity Catalog and SCD patterns, now handling petabyte-scale data at 35% lower storage cost

Built Kafka + Kinesis streaming pipelines feeding real-time analytics for 500+ apps, alongside a data quality framework maintaining 99.5% accuracy

Owned data modeling for analytics use cases and automated pipeline orchestration using Airflow, reducing manual intervention and improving scheduling reliability

Operated all production workloads on AWS (Lambda, EC2, RDS, EKS, S3, Redshift, Glue, Athena) with monitoring and alerting configured from initial deployment

Data

Amazon EC2

Lambda

Amazon RDS

Apache Spark

Amazon Redshift

Apache Kafka

Kubernetes

Airflow

Amazon Athena

ClickHouse

DuckDB

Trino

Unity Catalog

AWS

Iceberg

AWS Glue

Spark optimization

Csdm

Timeseries databases

Project Leader

SVCET

2019-01-01 to 2020-01-01

Led a team of 4 developers on an Intelligent Dialogue System. NLP + deep learning approach achieved 90% intent recognition, meeting the client acceptance threshold

Coordinated requirements gathering, sprint planning, and delivery milestones across academic and industry stakeholders. Project completed two weeks ahead of schedule

NLP

Deep Learning

PROJECTS

ClickHomes AI: Real Estate Analytics and Lead Generation Platform

2026

ClickHomes AI is a production real estate analytics and lead generation platform built for Milton and the Greater Toronto Area. I architected a dual-database system using PostgreSQL for operational data and ClickHouse for OLAP analytics, processing 9,500+ MLS listings through a RESO-compliant ETL pipeline (Connector, RESOTransformer, and GoldDataLoader). The platform delivers hyper-local market intelligence at the postal code level, a side-by-side property comparison engine, a geo-farming toolkit with street-level analytics and QR-based door-hanger campaigns, and a full seller acquisition funnel with multi-factor lead scoring, email drip campaigns, and appointment booking. An AI chat agent built on LangGraph ReAct architecture provides filter-first property search with streaming responses. A builder inventory crawler with SHA-256 change detection feeds a New Homes Hotlist with real-time buyer alert notifications. I built the platform end to end as the sole architect and engineer.

Python

SQL

PostgreSQL

Redis

TypeScript

Celery

Docker

Next.js

Apache Airflow

Tailwind CSS

ClickHouse

FastAPI

Playwright

Pydantic

LangChain