About Me

I’m Tom Pham, a self-taught data scientist focusing on machine learning and deep learning. I am interested in implementing artificial intelligence to solve various problems. With a diverse cultural background and multi-disciplinary knowledge, I seek to tackle all issues from various angles.

Workspace setup
Hardware

  • 2019 15-inch MacBook Pro

  • 2.6 GHz 6-core Intel Core i7

  • 16 GB 2400 MHz DDR4 RAM

  • Radeon Pro 555X 4 GB

  • 32-inch external monitor

Workspaces

  • Remote JupyterLab via [Paperspace Gradient Notebooks](https://www.paperspace.com/gradient/notebooks)

  • IDE: Visual Studio Code

Education

Biology

Graduate-level education in biology with a focus on hematopoietic stem cells:

  • Bachelor of Arts in General Biology and Biochemistry

  • Master of Science in Stem Cell Biology

Data Science

Self-taught via various resources with hands-on practical exercises and projects.

Coursera:

These lessons are beautifully delivered by Andrew Ng and cover the fundamental concepts of machine learning in detail, along with helpful advice on implementing them.

Practical Deep Learning For Coders book and lectures from 2022

Taught by the founder of fastai, Jeremy Howard, this course uses a unique top-down approach that applies production-level deep learning models first and then gradually covers the theory behind them. As a result, learners are able to rapidly solve real-world problems with AI models. For more information on why this course is at the forefront of deep learning for beginners, check out this article.

Courses and certificates from Kaggle:

Python for Data Analysis book, 3rd edition by Wes McKinney

This book is written by the creator of the Python library pandas and goes in depth into the programming language as well as its NumPy, pandas, and Matplotlib packages.

German as a foreign language

Level B1.2 in German according to the Common European Framework of Reference for Languages (CEFR)

![B1.2 German](/assets/posts/about/B1-2_German.png)

Experience

Applied FlowSOM, an unsupervised learning algorithm that uses a self-organizing map, a minimum spanning tree, and hierarchical clustering, to cluster flow cytometry data. The high-parameter data were visualized with the t-SNE and UMAP dimensionality reduction algorithms.