Research Notebook

Mathematics for Machine Learning (by Coursera)

I took this course to understand the mathematical calculations and transformations that happen behind the scenes in the code for a machine learning algorithm. The course had three parts, in which I learned the following:

Linear Algebra:

Fundamentals of vectors: changing basis and linear independence.
Matrix transformations: inversion and rotation, to name a few.
Einstein Summation Convention
Gram-Schmidt process, with coding assignment
Eigenvectors and the PageRank algorithm, with coding assignment
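
To make the Gram-Schmidt item above concrete, here is a minimal sketch in plain Python. It is not the course's assignment code: the function name `gram_schmidt`, the tolerance `tol`, and the sample vectors are assumptions made for illustration. It uses the "modified" variant, which subtracts each projection from the running remainder, and it drops any vector that turns out to be linearly dependent on the earlier ones.

```python
import math

def gram_schmidt(vectors, tol=1e-10):
    """Return an orthonormal basis for the span of `vectors` (lists of floats)."""
    basis = []
    for v in vectors:
        w = list(v)
        # Remove the component of w along each basis vector found so far.
        for b in basis:
            coeff = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        # A (near-)zero remainder means v depended linearly on earlier vectors.
        if norm > tol:
            basis.append([wi / norm for wi in w])
    return basis

# Hypothetical input: the third vector is the sum of the first two,
# so only two basis vectors should survive.
basis = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [2.0, 1.0, 1.0]])
print(len(basis))
```

The number of vectors returned equals the dimension of the span, which ties the process back to the linear independence topic at the top of the list.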

Multivariate Calculus:

Derivatives: Jacobian, Hessian, Maclaurin Series, Taylor Series
How backpropagation relates to neural networks, with coding assignment
Newton-Raphson Method
Gradient Descent with coding assignment
Mathematics behind Simple Linear Regression
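
The last two items above meet in a small example: fitting a simple linear regression line by gradient descent. The sketch below is an illustration under stated assumptions, not the course's code. The function name `fit_line`, the learning rate, the step count, and the toy data are all hypothetical choices.

```python
def fit_line(xs, ys, lr=0.05, steps=5000):
    """Fit y = w*x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Residuals of the current line at every data point.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Partial derivatives of MSE = (1/n) * sum(error^2) w.r.t. w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Step against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy data generated from y = 2x + 1 (an assumption for this example).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))  # prints "2.0 1.0"
```

The learning rate matters here: too large a value makes the updates diverge, while too small a value needs many more steps to reach the same line.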

Principal Component Analysis (PCA):

Statistics of datasets
Inner products
Orthogonal projections
PCA: a widely used method for reducing the dimensionality of a dataset.

Machine Learning A-Z (by Udemy)

I took this course to learn about the use cases and implementations of many different types of machine learning algorithms. The course covered several data preprocessing steps, along with many regression, classification, clustering, and association rule learning algorithms. I also learned about more advanced forms of machine learning, such as reinforcement learning, natural language processing, and deep learning. The course then went over dimensionality reduction algorithms such as PCA, techniques to select a model, and lastly, a brief look into ensemble learning through XGBoost.

I will be implementing many of the data preprocessing techniques I learned, since my dataset has categorical variables and missing values. I learned how to implement the algorithms I chose for my project (Random Forest, Logistic Regression, and K-Nearest Neighbors) and developed a deeper understanding of binary classifiers. The course also showed me how clustering could offer insight into my dataset and add another facet of originality to my project. Finally, I learned two performance evaluation techniques, confusion matrices and the Cumulative Accuracy Profile, whose results I will be using as my dependent variables.
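
As a small illustration of the confusion matrix mentioned above, the sketch below counts the four outcomes of a binary classifier in plain Python. The function names `confusion_matrix` and `accuracy`, the 0/1 label encoding, and the sample labels are assumptions for this example. They are not taken from the course or from my project data.

```python
def confusion_matrix(y_true, y_pred):
    """Count TP, FP, FN, TN for binary labels encoded as 0 and 1."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for actual, predicted in zip(y_true, y_pred):
        if predicted == 1:
            counts["tp" if actual == 1 else "fp"] += 1
        else:
            counts["fn" if actual == 1 else "tn"] += 1
    return counts

def accuracy(counts):
    """Fraction of predictions that matched the true label."""
    return (counts["tp"] + counts["tn"]) / sum(counts.values())

# Hypothetical true labels and predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
counts = confusion_matrix(y_true, y_pred)
print(counts)            # {'tp': 3, 'fp': 1, 'fn': 1, 'tn': 3}
print(accuracy(counts))  # 0.75
```

Keeping the four counts separate, rather than reporting accuracy alone, shows which kind of mistake a classifier makes more often.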

For a deep dive into everything that I learned from this course, take a look at this presentation.