Mateus Picanco

Mentor
5.0
(3 reviews)
US$15.00
For every 15 mins
9
Sessions/Jobs
First 15 mins free for your first session
ABOUT ME
Full-stack Data Scientist with solid experience in Machine Learning, Natural Language Processing and extensive training in mentorship

Hello everyone!

I'm Mateus, a full-stack Data Scientist from Brazil with a background in Digital Signal Processing. I graduated from Brown University in 2018 with a Bachelor of Science in Electrical Engineering and have been working in the Data domain ever since. I have experience with the full scope of the Data Science process, from building data pipelines and developing models to evaluating A/B tests for Data Products.

I'm open to any mentorship opportunities related to data, especially when it comes to Machine Learning, Natural Language Processing, Python, and Elasticsearch.

I am currently a Data Scientist at Microsoft and have previously worked at the largest investment bank in Latin America and at Telefonica.

Throughout my career, I developed skills and projects in various segments, including Customer Segmentation, Next Best Offer models, and detection of rare events.

I have over 4 years of experience in programming with Python, Machine Learning (especially when applied to CRM and Product Analytics), and Natural Language Processing. I also have solid experience implementing data orchestration and Elasticsearch-based analytics.

Finally, I have been a mentor for both for-profit and non-profit organizations since I was 16 years old. I'm extensively trained in mentorship and tutoring and have mentored all kinds of people in many topics, from essay writing to machine learning.

Portuguese, English
Atlantic Time (Canada) (-03:00)
Joined May 2020
EXPERTISE
4 years experience
- Object-oriented programming
- API development and deployment: FastAPI, Flask
- Development of packages for PyPI and offline serving
- Data visualization and dashboarding: Plotly, Dash, Streamlit
- Unit and functional testing: Pytest
- Data analysis toolkit: NumPy, Pandas, Seaborn, Matplotlib
- Statistics and simulation: Statsmodels, SciPy
- Machine Learning: Scikit-learn, XGBoost, CatBoost
- Deep Learning: PyTorch
- Digital Signal Processing: Librosa
4 years experience
- Fake-name detection and lead qualification for lead collection campaigns
- Anomaly detection in log files
- Sentiment analysis of financial news and topic modeling of customer complaints
- Incident prevention models for IT infrastructure and services
- Customer segmentation based on spending behavior and unsupervised learning
2 years experience
- Implementing data collection and ingestion using the entire Elastic Stack (Logstash, Beats, and Elasticsearch)
- Modelling data in Elasticsearch for analytics and search
- Data visualization in Kibana (including custom Vega-Lite visualizations)
- Anomaly detection using the Machine Learning plugin
Kibana
Logstash
Beats
Bash
2 years experience
- Implementing data processing pipelines, including data cleaning and aggregation
- Building tables in Redshift and Athena from raw data
- Automating and scheduling Spark jobs through Apache Airflow on AWS
3 years experience
- Data analysis and reporting with SQL in many dialects, especially PostgreSQL, Redshift, and Oracle databases
- Advanced data aggregation, such as window functions, CTEs, and complex joins
1 year experience
- Developing APIs with AWS Lambda and API Gateway
- Building data and machine learning pipelines with Elastic MapReduce (EMR) and SageMaker
- Building data lake structures with the AWS Big Data stack (Glue, EMR, S3, Redshift, and Lake Formation)

REVIEWS FROM CLIENTS

5.0
(3 reviews)
RR100
November 2021
Great data science tutor. Explains concepts clearly and sticks with you until you solve your project. I highly recommend his expertise.
Hank H'ng
March 2021
Great session again :)
Hank H'ng
February 2021
Super knowledgeable and helped me set up my python env and explained everything clearly :) Would recommend.
EMPLOYMENTS
Data Scientist
Microsoft
2021-09-01 to Present
Data Scientist working on Analytics and experimentation for the Microsoft Stream product. My work focuses on Scorecard definition and evaluation for new flights and product iteration, as well as in-depth analyses of user behavior to drive product changes.
Python
Azure
Machine Learning
Product Design
Data analytics
Data Scientist
BTG Pactual
2020-09-01 to 2021-09-01
- Developed segmentation and scoring aimed at identifying the clients with the highest propensity for acquiring credit products in the BTG+ user base.
- Developed an unsupervised model for customer segmentation based on credit card spending preferences in different merchant categories (MCCs). Results drive targeted campaigns.
- Designed and built numerous ETL pipelines using Apache NiFi, Airflow, and Spark on AWS. Pipelines populate the BTG+ Data Lake for various purposes, including reporting, dashboards, and modeling.
- Designed and implemented the architecture and pipelines for a data quality assessment framework based on Apache Spark, Airflow, and AWS Glue.
- Designed, trained, and deployed a BERT-based Sentiment Analysis model for classifying news related to the Stock Market in Portuguese. The model is part of the BTG Index.
Python
Amazon S3
Machine Learning
DynamoDB
Apache Spark
Amazon Redshift
AWS Lambda
Apache Airflow
Apache NiFi
Data Scientist
Telefonica
2019-02-01 to 2020-08-01
- Built and maintained data pipelines based on the Elastic.co stack (Elasticsearch, Logstash, Kibana). Pipelines process around 4TB of data per month.
- Designed and implemented a data self-service platform based on Elasticsearch for IT Operators. Data stores and reporting tools are used by approximately 130 people and provide access to 80 different curated data sources.
- Developed an ML model for predicting the occurrence of IT incidents in Telefonica's Online Charging System. Over the two-month period in which it was in production, it reported an F1 score of around 84% and prevented 3 critical incidents or possible outages in the system.
- Developed a model for identifying spammy and fake names in lead collection campaigns for the Marketing department. Deployed the model as a REST API using FastAPI.
- Built a custom parser using PySpark for handling event and configuration data stored in XML files following the 3GPP industry specification.
- Wrote and executed procurement processes for software acquisition and services using RFPs (Requests for Proposal).
- Wrote material for internal, team-led workshops with other teams across the company.
Python
MySQL
Flask
Elasticsearch
Kibana
Logstash
Apache Spark
PROJECTS
Data Pages
N/A
2020
This project is a proof of concept for exploration of enterprise data by non-technical users using full-text search engine capabilities. It aims to illustrate a way to foster data-driven decision-making without the intervention of technical teams, a concept known as Data Self-Service. It was inspired by Looqbox, a Brazilian startup. Data Pages consists of guided access to analyses previously made available by technical teams as specifications in a data directory. The search engine capabilities provide a cleaner interface for finding relevant data about the company, an alternative to list-based directories and generic dashboards.
Python
Heroku
Pandas
Elasticsearch
Streamlit
Project Atlas - São Paulo
2021
A feature store project aimed at developing geospatially referenced features regarding the city of São Paulo, including features related to crime, real estate, income, shopping activity, and much more. The project has been released on Kaggle and contains over 200 features at different levels of interest for use.
Geospatial Technology
Apache Spark
PostGIS
Apache Sedona