Thales Gibbon

Mentor
Rising Codementor
US$15.00 per 15 minutes
ABOUT ME
Data Engineer | Cloud | 3x GCP Certified

Senior Data Engineer with 8+ years of experience building scalable data platforms and pipelines in both batch and real-time environments. Specialized in distributed data processing, cloud architecture, and machine learning infrastructure. Proven expertise in designing resilient systems using Apache Spark, Apache Beam, Airflow, and Terraform on GCP and AWS. Highly skilled in Python and SQL, with hands-on experience integrating data products with data warehouses, data lakes, and ML/AI models. Effective communicator with cross-functional teams, including data analysts, data scientists, DevOps, and business stakeholders.

Brasilia (-03:00)
Joined April 2024
EXPERTISE
Cloud Data Warehouse (8 years experience)

REVIEWS FROM CLIENTS

Thales's profile has been vetted and approved as a Codementor; no client reviews have been posted yet.
SOCIAL PRESENCE
GitHub
prj-hack-delorean-12345 (Dart) · 1 · 2
dronepylot (Python) · 0 · 0
EMPLOYMENTS
Senior Data Engineer
Lumenalta
May 2025 – Present
Principal Data Engineer
PicPay
January 2023 – April 2025
• Led the development of a company-wide data platform, defining architecture, roadmap, and release priorities based on both business and technical needs.
• Served as the technical lead for multiple areas, including Data Ingestion, Machine Learning, Data Platform, Data Governance, and Analytics Engineering, supporting a 60-person cross-functional team.
• Designed and implemented scalable real-time pipelines using Debezium, Kafka, and Apache Spark Streaming on Kubernetes, enabling low-latency decision-making (see the sketch following this description). Standardized observability using Prometheus and Grafana.
• Built a unified ingestion framework in Python applying object-oriented programming principles, reducing code duplication and operational incidents, while enabling ingestion from multiple data sources into the Data Lake.
• Designed a solution on AWS for archiving, purging, and serving historical data, exposing services via Kubernetes to backend microservices, reducing MTTR and promoting data reusability across the company.
• Collaborated with ML and Data Engineering teams to identify overlapping services and consolidated them into a single fault-tolerant platform, reducing operational costs and increasing system synergy.
Stack: AWS, Databricks, MySQL, MongoDB, Apache Spark, Airflow, Debezium, Kafka, Spark Streaming, Terraform, CI/CD, Kubernetes, SageMaker, Python, Prometheus, Grafana
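For illustration, a minimal sketch of the change-data-capture pattern described above: Debezium publishes row-level changes to Kafka, and a Spark Structured Streaming job parses the envelope and lands the stream in the lake. This is a hypothetical example, not PicPay's actual code; the broker address, topic, envelope schema, and storage paths are placeholders.

# Hypothetical sketch (not PicPay's code): Debezium CDC events read from Kafka,
# parsed with Spark Structured Streaming, and landed in the data lake.
# Requires the spark-sql-kafka connector on the Spark classpath.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StringType, StructField, StructType

spark = SparkSession.builder.appName("cdc-ingestion-sketch").getOrCreate()

# Assumed subset of the Debezium envelope; only the fields used downstream.
envelope = StructType([
    StructField("op", StringType()),      # c = create, u = update, d = delete
    StructField("after", StringType()),   # row state after the change, as JSON
    StructField("ts_ms", StringType()),   # change timestamp from the source DB
])

raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "kafka:9092")    # placeholder broker
    .option("subscribe", "dbserver1.commerce.orders")   # placeholder topic
    .option("startingOffsets", "latest")
    .load()
)

# Kafka message values arrive as bytes; cast to string and parse the envelope.
changes = raw.select(
    from_json(col("value").cast("string"), envelope).alias("event")
).select("event.*")

# Land the raw change stream; downstream jobs merge it into curated tables.
query = (
    changes.writeStream.format("parquet")
    .option("path", "s3a://lake/raw/orders/")                  # placeholder
    .option("checkpointLocation", "s3a://lake/_chk/orders/")   # placeholder
    .start()
)
query.awaitTermination()

In the setup described above, a job like this would run on Kubernetes, with Prometheus and Grafana providing the standardized observability.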
MySQL
MongoDB
AWS EMR
Data Lake
AWS Athena
Databricks
Debezium
Staff Data Engineer
Banco Original
July 2019 – December 2022
• Led the development of data pipelines focused on CRM, Customer Life Cycle, and Growth initiatives, reporting directly to the Executive Director.
• Acted as the technical lead for a 10-person squad composed of data engineers, ML engineers, and data scientists.
• Built a high-throughput ETL architecture using Python and Apache Spark on Hadoop to create the company’s Data Lake and integrate it with the CRM platform for campaign data refreshes.
• Negotiated with cross-functional tech teams and led the configuration of the company’s first cloud provider. Migrated all data pipelines to Google Cloud using serverless and scalable services, reducing pipeline processing time by 80% (see the sketch following this description).
• Designed and deployed the company’s first real-time ML model for payment default prediction based on behavioural data from mobile interactions. The model was hosted on Vertex AI, with a BigQuery feature store and real-time integration with loan product backend systems.
Stack: GCP, BigQuery, Oracle, Dataflow, Apache Beam, Hadoop, Hive, AWS, Data Lake, Terraform, Python
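For illustration, a minimal sketch of the kind of serverless streaming pipeline described above: an Apache Beam job that reads JSON events from Pub/Sub and appends them to BigQuery, runnable on Dataflow. This is a hypothetical example, not Banco Original's actual code; the project, topic, table, and field names are placeholders.

# Hypothetical sketch (not Banco Original's code): a streaming Beam pipeline
# that reads JSON events from Pub/Sub and appends them to a BigQuery table.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def parse_event(message: bytes) -> dict:
    """Decode a Pub/Sub message into a row matching the BigQuery table schema."""
    event = json.loads(message.decode("utf-8"))
    return {
        "customer_id": event.get("customer_id"),   # placeholder field names
        "event_type": event.get("event_type"),
        "event_ts": event.get("event_ts"),
    }


def run() -> None:
    # For Dataflow, also set runner="DataflowRunner", project, region, and
    # temp_location; the DirectRunner is enough for local testing.
    options = PipelineOptions(streaming=True)
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "ReadFromPubSub" >> beam.io.ReadFromPubSub(
                topic="projects/my-project/topics/crm-events")   # placeholder
            | "ParseJson" >> beam.Map(parse_event)
            | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                "my-project:crm.events",                         # placeholder
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
            )
        )


if __name__ == "__main__":
    run()

With the Dataflow options filled in, the same pipeline definition runs as a managed, autoscaling Dataflow job, which is the kind of serverless, scalable service the migration above relied on.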
ETL
Google Pub/Sub
Apache Beam
Google Cloud Functions
Google Dataflow
Vertex AI