Phani Kumar Yadavilli

Mentor
Rising Codementor
US$25.00
For every 15 mins
ABOUT ME
Data Engineer | Mentor | Passionate about learning and teaching.

Results-driven Data Engineer with experience in designing and developing terabyte- to petabyte-scale Data Platforms, Data Lakes, and ETL Frameworks for processing batch and real-time data.

Passionate about Distributed Systems and the Big Data ecosystem's tools and technology stack. Strong acumen in choosing the right tools and technologies for building scalable architectures and platform solutions that process and analyze structured and unstructured datasets, supporting Engineers, Data Analysts, Data Scientists, and many customers. Able to build production-grade Data Pipelines from scratch based on business requirements.

• 9 years of architecting and designing highly scalable Big Data platforms, from Data Ingestion through Data Processing. Experience in building homegrown frameworks for Batch Processing and Real-Time/Streaming analytics.

• Built a Streaming Data Analytics platform that processes ~500 TB of telemetry events on ~80 PB of raw data per day. Designed Cisco's Data Lake, which serves cross-functional teams as well as Cisco's external customers.

• Instrumental in scaling the Cisco Syslog NG (Next Generation) platform from ~13 million events per day to ~55 million events per day. Syslog NG is a highly scalable, highly available, event-driven, real-time Distributed Data Pipeline. It uses Apache Kafka as the message bus for inter-process communication, orchestrating functional services developed in Java and running in Tomcat-based Spring Boot containers. The data is processed using Spark Streaming jobs.

• Saved Cisco Systems Inc. $3,400 per quarter by redesigning and rewriting the Pentaho Data Integration-based ETL pipelines as Microservices-based pipelines and running them at scale.

• Led the Data Engineering effort to build a scalable Batch Processing platform that processes Mobility data for Cisco customers, gathering metrics and calculating KPIs to analyze Quality of Experience.

Pacific Time (US & Canada) (-07:00)
Joined July 2021
EXPERTISE
9 years experience
9 years experience
I have been using Java since the start of my career and have built large-scale distributed systems with it.
9 years experience
I have extensive hands-on experience with Apache Spark. I understand Spark's internals and have built data pipelines that scale to process 55 million events a day. I have fine-tuned Spark data pipelines from scratch.
https://phanikumaryadavilli.medium.com/writing-udfs-user-defined-functions-in-apache-spark-4d263577b729
https://phanikumaryadavilli.medium.com/writing-tests-for-your-spark-code-using-funsuite-71a554f92106
https://phanikumaryadavilli.medium.com/avoiding-spark-to-read-and-generate-crc-and-success-files-c52b300c0a77
8 years experience
I have used Kafka both as a distributed message queue and as a stream processing backend. I understand Kafka in depth. I have built many microservices using the event sourcing pattern, with Kafka used for Inter-Process Communication.
https://phanikumaryadavilli.medium.com/parsing-apache-kafka-consumer-offsets-using-kafka-command-and-java-api-58880a62371d
5 years experience
I've deployed projects to various container options, including Docker and Tupperware (Facebook's own container format). We used the Docker format to deploy Spring Boot applications and orchestrated them using Kubernetes.
8 years experience
Data Pipeline Design
9 years experience

REVIEWS FROM CLIENTS

Phani's profile has been carefully vetted and approved as a Codementor. Connect with Phani now, and leave a review for them once you're done!
EMPLOYMENTS
Lead Data Engineer
Cisco Systems Inc.
2019-01-01 to Present
At Cisco, I am leading the Data Engineering efforts for the Contact Center Business Unit.
• Designed the Data Ingestion Framework, scaling it from 20 TB to 50 TB of raw data per day.
• Designed the Data Processing Framework to process 100 TB to 5 PB of data per day, using Spark as the processing engine and HDFS as the storage.
• Integrated the Data Ingestion and Data Processing frameworks using Apache Kafka.
• Implemented best tooling and followed best practices for performance tuning.
• Implemented a monitoring framework for the Data Pipeline using homegrown and open source frameworks.
Java
Apache Spark
Apache Kafka
Apache Airflow
Apache NiFi
Apache HBase
PROJECTS
Streaming Data Pipeline to process telemetry data from devices.
CapitalOne
2019
Large-scale streaming analytics data pipeline to process device data.
Java
Apache Spark
Apache Kafka