Mode of Communication: Bangla
Duration: 12 weeks (3 months), plus a project
Frequency: Once a week (Saturday)
Live Session Time (2-hour lecture):
Bangladesh Time: 9:00 PM – 11:00 PM
USA Pacific Time (CA/OR/WA): 7:00 AM – 9:00 AM
UK Time: 3:00 PM – 5:00 PM
Total Class Time: ~30 hours of live lectures and support
Assignments: Weekly problem-solving exercises
Coding Support: Provided through Google Classroom
Projects: Hands-on, comprehensive projects
Contact: oxfordbiodiscoveryventures@gmail.com
Prerequisites: None. This course is designed for complete beginners (no programming experience required).
Python basics: variables, lists, loops and functions; Introduction to Python for life-science researchers; Using Google Colab: running cells, uploading files, saving notebooks; Introduction to data tables using pandas; Basic plots with matplotlib
Lists, dictionaries and pandas indexing; Identifying data quality issues in biological datasets; Handling missing values; Scaling and normalisation; Basic statistics; EDA workflow; Introduction to train/test split
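As an illustration of the missing-value handling and scaling topics listed for this session, here is a minimal sketch in plain Python. The expression values and the helper names `impute_mean` and `min_max_scale` are hypothetical, chosen only for illustration; in practice these steps are typically done with pandas or scikit-learn.

```python
from statistics import mean

# Hypothetical expression values for one gene; None marks a missing measurement.
values = [2.0, None, 4.0, 6.0, None, 8.0]

def impute_mean(xs):
    """Replace missing entries (None) with the mean of the observed entries."""
    observed = [x for x in xs if x is not None]
    fill = mean(observed)
    return [fill if x is None else x for x in xs]

def min_max_scale(xs):
    """Rescale values linearly to [0, 1]. Assumes the values are not all equal."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]

imputed = impute_mean(values)
scaled = min_max_scale(imputed)
print(imputed)
print(scaled)
```

Mean imputation is only one simple option; whether it is appropriate depends on why the values are missing.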
ML vs classical statistical approaches; Supervised vs unsupervised learning; Classification and regression; Model training, validation and testing; Key metrics: accuracy, AUROC, RMSE
Feature scaling: StandardScaler, MinMaxScaler; Feature selection: variance filtering, ANOVA, L1/Lasso; Overfitting and underfitting; Model complexity and regularisation; Hyperparameter tuning with GridSearchCV; Building ML pipelines
Why biomedical datasets are often imbalanced; Precision, recall, F1 score; Precision–recall curves; Oversampling (SMOTE) and class weighting; Calibration curves and reliability plots
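To show why accuracy alone can mislead on imbalanced data, the sketch below computes precision, recall and F1 by hand in plain Python. The labels and predictions are made up for illustration, and the helper name `confusion_counts` is hypothetical; libraries such as scikit-learn provide equivalent metric functions.

```python
# Hypothetical labels: 1 = disease, 0 = healthy (2 positives out of 20 samples).
y_true = [1, 1] + [0] * 18
# A hypothetical classifier that finds one positive, misses one, and raises one false alarm.
y_pred = [1, 0] + [1] + [0] * 17

def confusion_counts(y_true, y_pred):
    """Count true positives, false positives, false negatives and true negatives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this toy case accuracy is 0.90 while precision, recall and F1 are all 0.50, because the many correctly classified healthy samples dominate the accuracy score.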
Clustering: k-means and hierarchical clustering; PCA: variance, components and projection; t-SNE and UMAP for biological data; Visualising biological structure (for example, cell populations)
What deep learning offers for biological research; Structure of neural networks; Activations, losses and optimisers; Training loops and overfitting; Introduction to PyTorch or TensorFlow
Understanding convolutions, kernels and feature maps; CNN architectures; Transfer learning for biological images; Data augmentation; Evaluation and visualisation of predictions
String manipulation for biological sequences; Encoding DNA and protein sequences; 1D CNNs for motif detection; RNNs/LSTMs (conceptual overview); Sequence classification tasks; Basic sequence-model interpretation
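One common way to encode a DNA sequence for a model is one-hot encoding, sketched below in plain Python. The function name `one_hot_encode`, the A/C/G/T column order, and the choice to map unknown bases such as N to an all-zero row are assumptions made for this illustration, not a fixed convention.

```python
ALPHABET = "ACGT"  # assumed column order for the encoding

def one_hot_encode(seq):
    """Return one 4-element row per base; bases outside ALPHABET become all zeros."""
    index = {base: i for i, base in enumerate(ALPHABET)}
    encoded = []
    for base in seq.upper():
        row = [0] * len(ALPHABET)
        if base in index:
            row[index[base]] = 1
        encoded.append(row)
    return encoded

encoded = one_hot_encode("ACGTN")
for base, row in zip("ACGTN", encoded):
    print(base, row)
```

The resulting rows form a length-by-4 matrix, which is the kind of input a 1D CNN can scan for motifs.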
Introduction to single-cell datasets; Scanpy workflow: normalisation, HVG selection, neighbours, UMAP; Clustering and marker gene identification; Integrating ML for cell-type prediction; Using ML for automated annotation
Introduction to Transformer architecture; Self-attention and attention heads; Sequence modelling using Transformers; Applications in genomics, proteomics and biomedical NLP; Fine-tuning pretrained models; Visualising attention maps
Student project presentations; Writing reproducible ML workflows; Version control (GitHub basics); Managing randomness and seeds; Documentation and workflow organisation; Suggested pathways for advanced learning in ML for life sciences
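As a small illustration of managing randomness and seeds, the sketch below uses a seeded generator from Python's standard `random` module so that a shuffled train/test split can be reproduced exactly. The function name `shuffled_split` and the sample names are hypothetical.

```python
import random

def shuffled_split(items, test_fraction, seed):
    """Shuffle a copy of items with a seeded generator, then split into train and test."""
    rng = random.Random(seed)  # local generator; the global random state is untouched
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

samples = [f"sample_{i}" for i in range(10)]
train_a, test_a = shuffled_split(samples, 0.2, seed=42)
train_b, test_b = shuffled_split(samples, 0.2, seed=42)

# The same seed gives the same split on every run.
print(train_a == train_b and test_a == test_b)
```

Recording the seed alongside the results is what lets someone else regenerate the same split later.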
Mohammad Arafat Hussain
Postdoctoral Research Fellow, Boston Children's Hospital, Harvard Medical School, Boston, USA
PhD in Biomedical Engineering, University of British Columbia, Vancouver, Canada
Oxford, Oxfordshire, United Kingdom