Abubakar Omer

Rising Codementor

US$10.00

For every 15 mins

ABOUT ME

Machine Learning Engineer

Machine Learning Engineer with 5 years of experience delivering production ML and applied AI systems for healthcare, insurance, enterprise, and government clients, specializing in document intelligence, multimodal extraction, retrieval and ranking, classification, and conversational analytics across AWS and Databricks. Owns evaluation, observability, and reliable production behavior. Delivered outcomes including 95% automation of manual coding work, 10+ engineering hours saved per week, and 25% lower customer churn in insurance retention workflows.

Arabic, English

Central Time (US & Canada) (-05:00)

Joined June 2025

EXPERTISE

Python

7 years experience

4 years experience

4 years experience

3 years experience

2 years experience

3 years experience

4 years experience

REVIEWS FROM CLIENTS

Abubakar's profile has been carefully vetted and approved as a Codementor.

SOCIAL PRESENCE

GitHub

init.nvim

An Opinionated Neovim Config for the Minimalists

Vim Script

428

Dotfiles

KDE Plasma for twm users.

Shell

102

EMPLOYMENTS

Machine Learning Engineer

Unique Computing LLC

2024-11-01 to Present

Consulting ML Engineer delivering production AI systems for Optum Serve, a federal healthcare client.

Multi-Agent Analytics Platform: Architected a multi-agent platform with an LLM-based intent classifier coordinating two specialized agents: a RAG retrieval agent over pre-generated reports and a DuckDB code-execution agent, enabling business users to query operational data conversationally without SQL or code.
SAS-to-Python Modernization Agent: Engineered an agentic SAS-to-Python pipeline using Claude Code CLI with a custom MCP server for documentation upload, GitHub push, and Databricks job execution, generating complete Python packages with tests and docs in a single invocation.
SQL-RAG Schema Classifier: Built a BERT-based multilabel classifier identifying schema columns from natural language queries. Synthesized 200k+ labeled examples via Azure OpenAI, tracked via MLflow on SageMaker, achieving 86% average per-label accuracy.
Automated Thematic Analysis: Designed a dual-stage metadata-driven LLM pipeline using fine-tuned GPT-4o on Azure OpenAI with multi-level parallelism, reducing survey processing time by 75% (30+ mins to <7 mins).
Form Type Classifier: Developed a CPU-only, training-free classifier for scanned federal medical forms using one-shot ORB feature matching and RANSAC alignment, achieving 95–98% confidence across 21 form types with a single exemplar per class.
VA Claims ACE Pipeline: Delivered a 4-phase multi-agent pipeline on AWS Bedrock: a preprocessing agent indexes PDFs into a tagged Knowledge Base; five domain-specialized agents perform tag-filtered retrieval with PydanticAI outputs; a Conflict Detection Agent flags inconsistencies; a rules-based ACE Determination Agent routes to human review, approval, or C&P exam.

Machine Learning

Computer Vision

Data Science

Presentations

Modelling

Deep Learning

AWS

Prompt Engineering

Generative AI

LLM

LLaMA

Retrieval-Augmented Generation

Retention

Data Scientist

Unique Computing LLC

2021-10-01 to 2024-11-01

Consulting role providing data science solutions for pharmaceutical, government, and enterprise clients.

Client: Merck Pharmaceuticals

Cell Viability Prediction: Developed a real-time cell viability monitoring system using custom YOLO object detection models in collaboration with laboratory scientists, achieving 92% accuracy and automating previously manual video analysis.

Client: US Census Bureau

Multi-Label Text Classification: Trained a custom BERT-based multilabel token classification model for race coding across a dataset of 2M+ entries, automating 95% of manual data coding tasks.

Internal Projects

GenNet AI: Launched an AI-powered physician platform for RAG-based querying over consultation transcripts and patient data with native OpenEMR integration; advised physicians and clinical stakeholders on GenAI adoption roadmap and self-hosted LLM deployment strategy, deploying fully on vLLM and h2oGPT.
Churn Prediction Platform: Brought a modular churn prediction pipeline from prototype to production on AWS Step Functions with ECS jobs, serving multiple clients simultaneously. Used PyCaret for automated model selection and SHAP for explainability, achieving a 25% reduction in churn rates.

Machine Learning

Computer Vision

Data Science

Presentations

Modelling

Deep Learning

AWS

Prompt Engineering

Generative AI

LLM

LLaMA

Retrieval-Augmented Generation

Retention

PROJECTS

structx

2025

structx is a powerful Python library that extracts structured data from text using Large Language Models (LLMs). It dynamically generates type-safe data models and provides consistent, structured extraction with support for complex nested data structures. Whether you're analyzing incident reports, processing documents, or extracting metrics from unstructured text, structx provides a simple, consistent interface with powerful capabilities.

LLM

Scrapy-LLM

2024

Fully automates the process of web scraping by leveraging a Large Language Model (LLM). Use any existing LLM or bring your own!

LLM