Interview resources: ML/Data Science/AI Research Engineer

A curated list of topics, resources and questions

Purvanshi Mehta
Feb 15, 2021

Interviewing is a grueling process, especially during COVID. I recently interviewed with Microsoft (Data Scientist II), Amazon (Applied AI Scientist) and Apple (Software Development: Machine Learning).

Although these interviews differed a bit, the basic questions asked were the same. During the process I curated this list, which should help you prepare for most ML interviews.

NOTE: This list is meant only for last-minute revision.

Machine Learning

Linear and Logistic Regression- http://cs229.stanford.edu/notes2020spring/cs229-notes1.pdf

Naive Bayes- https://towardsdatascience.com/naive-bayes-classifier-81d512f50a7c

SVM / Kernel- http://cs229.stanford.edu/notes2020fall/notes2020fall/cs229-notes3.pdf

Random Forests, Decision Trees, Boosting, Bagging, XGBoost- StatQuest YouTube videos https://www.youtube.com/watch?v=J4Wdy0Wc_xQ

EM Algorithm- http://cs229.stanford.edu/notes2020spring/cs229-notes8.pdf

K-means- https://towardsdatascience.com/understanding-k-means-clustering-in-machine-learning-6a6e67336aa1

K nearest neighbors- https://www.analyticsvidhya.com/blog/2018/03/introduction-k-neighbours-algorithm-clustering/

Evaluation Metrics (scroll to the definition section; you need to know the confusion matrix, precision, recall, Type I and Type II errors, FP rate, and sensitivity)- https://en.wikipedia.org/wiki/Precision_and_recall
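
As a quick refresher on how the evaluation metrics listed above relate to the confusion matrix, here is a minimal sketch in plain Python. The function names (`confusion_counts`, `rates`) and the toy labels are made up for illustration; in practice you would likely reach for a library instead.

```python
def confusion_counts(y_true, y_pred):
    """Count (TP, FP, FN, TN) for binary labels, where 1 = positive and 0 = negative."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # Type I errors
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # Type II errors
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def safe_div(numerator, denominator):
    """Return 0.0 instead of raising when a denominator is empty."""
    return numerator / denominator if denominator else 0.0


def rates(tp, fp, fn, tn):
    """Derive the usual interview metrics from the four confusion-matrix cells."""
    return {
        "precision": safe_div(tp, tp + fp),  # of predicted positives, how many are right
        "recall": safe_div(tp, tp + fn),     # also called sensitivity or true positive rate
        "fp_rate": safe_div(fp, fp + tn),    # Type I error rate
        "fn_rate": safe_div(fn, fn + tp),    # Type II error rate, equal to 1 - recall
    }


# Toy labels, invented for this example.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

counts = confusion_counts(y_true, y_pred)
report = rates(*counts)
print(counts)
print(report)
```

A common follow-up question is why precision and recall pull against each other: lowering the decision threshold typically raises recall by turning false negatives into true positives, while also admitting more false positives, which lowers precision.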

Regularization (L1, L2, why is L1 sparse?)- https://explained.ai/regularization/L1vsL2.html
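
One way to see why L1 gives sparse weights is to look at a single weight under a simple quadratic loss. This is a simplified sketch, not the argument used in the linked article: it assumes a one-dimensional problem where `z` is the unregularized solution and `lam` is the penalty strength (both names are chosen here for illustration). Under that assumption both penalties have closed-form minimizers: L1 gives soft-thresholding, and L2 gives proportional shrinkage.

```python
def l1_solution(z, lam):
    """Minimizer of 0.5 * (w - z)**2 + lam * abs(w): soft-thresholding."""
    shrunk = abs(z) - lam
    if shrunk <= 0:
        return 0.0  # any weight with |z| <= lam is set exactly to zero
    return shrunk if z > 0 else -shrunk


def l2_solution(z, lam):
    """Minimizer of 0.5 * (w - z)**2 + 0.5 * lam * w**2: proportional shrinkage."""
    return z / (1.0 + lam)


# Hypothetical unregularized weights and penalty strength.
unregularized = [3.0, 0.4, -0.2, -1.5]
lam = 0.5

l1_weights = [l1_solution(z, lam) for z in unregularized]
l2_weights = [l2_solution(z, lam) for z in unregularized]

print(l1_weights)  # [2.5, 0.0, 0.0, -1.0]
print(l2_weights)  # every entry is smaller in magnitude, but none is exactly zero
```

The L1 penalty subtracts a fixed amount from every weight's magnitude, so small weights hit zero and stay there. The L2 penalty scales every weight by the same factor, so a nonzero weight never becomes exactly zero.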

Bias-Variance Tradeoff

Dimensionality Reduction-

Deep Learning

The first thing I would suggest is to go through all the deeplearning.ai courses, which are fairly basic. Anyone who already publishes or works in these areas can skip the videos and go straight to the following questions and resources:

NLP

For NLP, CS224N (Stanford CS224N: NLP with Deep Learning | Winter 2019 | Lecture 1: Introduction and Word Vectors, on YouTube) covers the basics of NLP with deep learning. This might cover about three quarters of the questions asked in an interview. The remaining questions are usually about more recent state-of-the-art models, as the interviewer wants to check how up to date you are.

Other topics —

  1. Linear Algebra- https://www.deeplearningbook.org/contents/linear_algebra.html
  2. Probability basics- http://www2.ece.rochester.edu/~gmateosb/ECE440/Slides/block_2_probability_review_part_a.pdf
  3. Stats- I had taken a graduate-level statistics class, so I didn't need to brush up on this, but Khan Academy https://www.khanacademy.org/math/statistics-probability is a very good source for learning the basics with examples.

These are the topics that came up in all of my interviews. Naturally, some questions were also specific to the research I had done. There were also live coding rounds, covering both algorithms and NN models. Let me know if I missed something.
