Alva Rani James, PhD


Mentor
Rising Codementor
US$10.00 per 15 minutes
ABOUT ME
Senior Bioinformatics Data Scientist

I am a Senior Bioinformatics Data Scientist with experience integrating multi-omics datasets, developing reproducible pipelines, and leading cross-disciplinary initiatives for data insights and drug discovery in rare diseases.

As of 2024, I have lived, studied 📚, and worked 🦾 in five countries:

🇮🇳 India, 🇸🇪 Sweden, 🇮🇹 Italy, 🇩🇪 Germany, 🇨🇭 Switzerland

I have more than nine years of research and development experience, specializing in next-generation sequencing (NGS) data analysis, bioinformatics pipeline development, machine learning, deep learning, and drug discovery.

As a bioinformatician, I have collaborated with diverse groups across multiple disciplines, from academia and industry to hospitals. Throughout my career, I have actively presented projects and their outcomes and explored potential avenues for collaboration.

I have published nine papers in high-impact scientific journals, showcasing my expertise in data insights generation and interpretation for rare diseases and cancer.

Berlin (+02:00)
Joined May 2024
EXPERTISE
9 years experience
5 years experience
9 years experience
4 years experience
1 year experience
9 years experience
3 years experience

SOCIAL PRESENCE
GitHub
Data_analysis
Python snippets for day-to-day manipulation of big data
Jupyter Notebook
CWLtool-pipeline-development
Common Workflow Language
EMPLOYMENTS
Senior Bioinformatics Data Scientist
Centogene AG
2021-09-01 to Present

Investigated and integrated multi-omics datasets via machine learning algorithms for diverse research teams and collaborators, using a broad set of tools and methods: Git/GitHub, Python (scikit-learn, pandas, Jupyter Notebook, and others), AWS, Docker, bioinformatics algorithms, R, R Shiny, linear regression, NLP, BERTopic, workflow management languages (Snakemake), and project management systems (Jira).

  • Co-led cross-disciplinary initiatives to derive data insights from omics datasets of rare-disease patients, using statistical, machine learning, and exploratory data analysis techniques
  • Led the data insights completeness analyses: coordinated research projects to ensure timely reporting and interpretation of results for multidisciplinary research teams
  • Applied generative AI to develop structured real-world data products for pharmaceutical companies, including Sanofi
  • Developed a reproducible pipeline for transcriptomics and metabolomics, and a risk score for large patient datasets
  • Built and deployed machine learning models for cross-disciplinary initiatives, and interpreted and explained their predictions to various stakeholders
Python
NLP
Snakemake
Postdoctoral Researcher
Hasso Plattner Institute
2020-02-01 to 2021-08-01

Identified the functions of non-coding RNAs in multiple cancer types by leveraging multi-omics datasets from the TCGA consortium and external evidence. Applied deep learning algorithms (convolutional neural networks, autoencoders) and bioinformatics tools to investigate and analyse the datasets, using libraries such as TensorFlow, pandas, scikit-learn, NumPy, seaborn, and Spark. Mentored master's students on mini projects, reporting, and presentations.

  • Developed a reproducible pipeline for variant effect prediction on UK Biobank exome sequencing data, using the Nextflow workflow management system (github.com/HealthML/Nextflow-pipeline)
  • Developed an evidence-based functional prediction method for non-coding RNAs dysregulated in multiple cancer types
SQL
Deep Learning
Bioinformatics
TensorFlow
Keras
Nextflow
AWS
Data Analyst
ETH Zurich
2018-09-01 to 2019-12-01

Analysed multi-omics cancer datasets received from hospitals across Switzerland.

  • Developed multiple reproducible bioinformatics pipelines using workflow management languages (e.g. an RNA-seq pipeline) for terabyte-scale NGS datasets, for three high-profile research projects (The Tumor Profiler Study, Roche-funded)
  • Integrated data and managed data structures for datasets from multiple technologies, including RNA-seq, single-cell sequencing, proteomics, exome-seq, and whole-genome sequencing
  • Versioned code and queried the LabKey database for patient sample metadata (GitHub, Python APIs)
  • Developed and documented pipelines for various audiences
  • Developed a cancer-specific score (hypoxia score) for a given cancer type using the Bayer score in R
  • Used Sun Grid Engine and cloud computing for cluster computing and parallelization
Python
Shell
Cloud
Bioinformatics
Snakemake
PROJECTS
Data visualization
2023
Multiple ways to visualise data using Python
Matplotlib
Python 3
Seaborn
RNA-seq Snakemake pipeline
ETH Zurich
2019
A reproducible RNA-seq Snakemake pipeline, from raw RNA-seq FASTQ files to differential expression analysis
Python
R
Shell
Bioinformatics
Bioconductor
Snakemake