Faisal Malik

Mentor
Rising Codementor
US$10.00
For every 15 mins
ABOUT ME
Data Engineer

Data Engineer with experience at several startups, currently at one of the world's largest consulting firms. At startups I gained the technical skills to build end-to-end data pipelines from scratch: ingesting data from various sources, processing it, loading it into the data destination, optimizing storage-compute decoupling, modeling the destination, and implementing data governance for business users. These skills allow me to turn messy raw data into actionable, data-driven insights. I have also developed my soft skills, especially in my current position at McKinsey, where I have learned a great deal about consulting, problem solving, team leadership, and more.

Indonesian, English
Singapore (+08:00)
Joined November 2022
EXPERTISE

REVIEWS FROM CLIENTS

Faisal's profile has been carefully vetted and approved as a Codementor. Connect with Faisal now, and leave a review for them once you're done!
SOCIAL PRESENCE
GitHub
data-driven-growth
Data Science and Machine Learning Implementation in Python to gather some insights for company growth
Jupyter Notebook
Paillier-Linear-ML
Implementation of Paillier's Homomorphic Encryption to the Linear Machine Learning Model.
Python
EMPLOYMENTS
Senior AI Engineer
US Biolab
Feb 2024 – Present
  • Design an end-to-end machine learning workflow for semantic segmentation of biospecimens, from data labeling to model serving, in the AWS ecosystem.
  • Provision serverless infrastructure on AWS using Lambda, SQS, DynamoDB, S3, ECS, and Fargate to enable tile-level model inference parallelization.
  • Automate experimentation, training, and hyperparameter tuning using a GitHub Actions dispatch workflow that provisions on-demand runners on EC2 to execute the jobs.
  • Implement various image transformations to improve segmentation results.
Python
TypeScript
Rust
PyTorch
Semantic segmentation
AWS
Bunjs
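The tile-level parallelization described above can be sketched roughly as follows. This is a minimal illustration, not the actual production code: the function names (`tile_keys`, `enqueue_tiles`), the message schema, and the tile size are all assumptions; only the boto3 SQS `send_message` call reflects the real AWS SDK.

```python
import json
from typing import Iterator

def tile_keys(width: int, height: int, tile: int) -> Iterator[dict]:
    """Yield one message body per tile of a large slide image.

    Each message identifies a pixel window so a Lambda worker can
    fetch just that region and run segmentation on it.
    """
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            yield {
                "x": x,
                "y": y,
                "w": min(tile, width - x),   # clip the last column
                "h": min(tile, height - y),  # clip the last row
            }

def enqueue_tiles(sqs_client, queue_url: str, image_key: str,
                  width: int, height: int, tile: int = 1024) -> int:
    """Send one SQS message per tile; Lambda consumers scale out to
    process tiles in parallel and write masks back to S3."""
    sent = 0
    for window in tile_keys(width, height, tile):
        sqs_client.send_message(
            QueueUrl=queue_url,
            MessageBody=json.dumps({"image": image_key, **window}),
        )
        sent += 1
    return sent
```

Fanning out one queue message per tile is what lets SQS-triggered Lambdas (or Fargate tasks) process a whole-slide image concurrently instead of sequentially.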
Senior MLOps Engineer
Walleye Capital
Nov 2024 – Nov 2025
  • Migrated all async OpenAI API calls in the Kubeflow pipeline to the OpenAI Batch API, which reduced OpenAI token consumption costs by 50% and eliminated all parallel invocation instances from Kubeflow components.
  • Standardized the pipeline into multiple reusable Kubeflow components, allowing all team members to productionize their pipelines seamlessly.
  • Set up a CI/CD pipeline to test and deploy from GitHub to the GCP ecosystem.
  • Set up a GitHub Actions workflow dispatch to submit and schedule Vertex AI Pipelines.
  • Provisioned and managed GCP infrastructure using IaC tools like Pulumi.
  • Refactored experimental code from data scientists to meet production and deployment standards.
  • Integrated different services and components built by other team members so the system can run smoothly and efficiently.
Python
SQL
Google BigQuery
OpenAI
Vertex AI
RAG
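The async-to-Batch migration above follows the standard OpenAI Batch API pattern: instead of firing many parallel chat-completion calls, requests are written to a JSONL file and submitted as one batch, which OpenAI bills at a discount. A minimal sketch of that pattern, assuming the official `openai` Python SDK; the helper names, model choice, and `custom_id` scheme are illustrative, not the actual pipeline code:

```python
import json

def batch_lines(prompts: list[str], model: str = "gpt-4o-mini") -> list[str]:
    """Build one JSONL line per prompt in the /v1/chat/completions
    batch request format; custom_id lets results be matched back."""
    lines = []
    for i, prompt in enumerate(prompts):
        lines.append(json.dumps({
            "custom_id": f"req-{i}",
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": model,
                "messages": [{"role": "user", "content": prompt}],
            },
        }))
    return lines

def submit_batch(client, path: str = "requests.jsonl"):
    """Upload the JSONL file and create a batch job that completes
    within 24 hours, replacing many parallel synchronous calls."""
    batch_file = client.files.create(file=open(path, "rb"), purpose="batch")
    return client.batches.create(
        input_file_id=batch_file.id,
        endpoint="/v1/chat/completions",
        completion_window="24h",
    )
```

In a Kubeflow component, the submit step and a later poll/download step replace the fan-out of parallel invocations, which is where both the cost reduction and the simpler DAG come from.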
Senior Data Engineer
Hivello
Jul 2024 – Oct 2025

All-in-One Decentralized Physical Infrastructure (DePIN) manager.

  • Design and implement a Data Ecosystem in GCP from the ground up using Terraform and GitHub Actions.
  • Design and implement log centralization from end-user applications to Cloud Logging.
  • Route the logs hourly to Cloud Storage and transform them on arrival using Cloud Run Jobs, with instance sizes adjusted to the log volume.
  • Store the transformed logs in BigQuery for further analytics, using a materialized view to optimize aggregate retrieval.
  • This strategy enables flexible analytics that accommodate new use cases without requiring changes to the log source.
  • Develop a blockchain indexing framework using a subgraph to index multiple DePINs' on-chain earnings data.
  • This allows the data to be accessed via GraphQL for analytics purposes.
  • Enable self-serve analytics throughout the company by provisioning Metabase and connecting BigQuery as a data source with business-friendly data models.
  • This keeps the company data-driven even without a data analyst translating business questions into SQL queries.
  • Visualize core business metrics in Grafana with optimized performance and views.
Python
Logging
Google BigQuery
Google Cloud Platform
Blockchain
Grafana
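The hourly transform step in the log-centralization bullets above could look roughly like this: a Cloud Run Job reads the exported log lines and flattens them into rows for a BigQuery load. The `timestamp`, `severity`, and `jsonPayload` fields are real Cloud Logging entry fields, but the payload keys (`app`, `message`) and function names are assumed for illustration, not taken from the actual pipeline:

```python
import json

def flatten_entry(raw: str) -> dict:
    """Flatten one exported Cloud Logging JSON entry into a flat
    row suitable for loading into a BigQuery table."""
    entry = json.loads(raw)
    payload = entry.get("jsonPayload", {})
    return {
        "timestamp": entry.get("timestamp"),
        "severity": entry.get("severity", "DEFAULT"),
        "app": payload.get("app"),          # assumed payload key
        "message": payload.get("message"),  # assumed payload key
    }

def transform_batch(lines: list[str]) -> list[dict]:
    """Transform an hourly batch of exported log lines, skipping
    entries that are not valid JSON (e.g. partial writes)."""
    rows = []
    for line in lines:
        try:
            rows.append(flatten_entry(line))
        except json.JSONDecodeError:
            continue
    return rows
```

Transforming into a fixed row schema on arrival is what lets downstream BigQuery materialized views aggregate cheaply, while keeping the end-user applications free to log without schema constraints.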