Applying for an ML position? Read this first

Basic expectations from candidates appearing for an ML interview

Having conducted about 50 interviews for the position of Machine Learning Engineer, and having been asked by many people what the expectations for the position are, I decided to write a post on the subject. This is not a list of the questions we ask, nor a comprehensive list of topics to prepare. Rather, it covers the major factors that, in my experience, usually lead to ML candidates being selected or rejected. Let’s jump straight in.

Things you mention on your CV: Be thorough with the projects and topics you mention on your CV. I’ve seen many people list projects they never worked on in order to sound impressive; the truth is that it will only hurt your odds of getting selected. Listing ML/DL buzzwords under projects without knowing what they mean does NOT help your chances.

You could mention very basic things like PCA or KNN; the key is knowing in depth how those methods work. Having used tools or functions that libraries offer without knowing what they’re good for is a clear red flag.

Then, there are a few must-know concepts if you are applying for an ML role:

Gradient descent
Overfitting/underfitting
Loss functions, basics of optimizers (the role they play in gradient descent)
Cross validation, regularization
Confusion matrix
Basic probability/ statistics and some linear algebra
Feature cleaning/normalization/selection
One or more basic Machine Learning or Deep Learning methods (whichever you’ve worked on)
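
To make the first few items concrete, here is a minimal sketch of batch gradient descent on a one-feature linear model, with an optional L2 penalty to show what regularization does to the fitted weight. The data, learning rate, and penalty strength are made up for illustration; this is not a question taken from an actual interview.

```python
# Minimal sketch: batch gradient descent for y ≈ w*x + b with an L2 penalty.
# The data, learning rate, and penalty strength are illustrative assumptions.

def mse_loss(w, b, xs, ys, l2=0.0):
    """Mean squared error plus an L2 (ridge) penalty on the weight."""
    n = len(xs)
    err = sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / n
    return err + l2 * w ** 2

def gradient_descent(xs, ys, lr=0.05, l2=0.0, steps=2000):
    """Repeatedly step against the gradient of mse_loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Partial derivatives of the loss with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n + 2 * l2 * w
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1

w, b = gradient_descent(xs, ys)                  # converges close to w=2, b=1
w_reg, b_reg = gradient_descent(xs, ys, l2=1.0)  # the penalty shrinks w
```

The optimizer here is plain gradient descent: the loss function defines what "good" means, and the update rule decides how to move the parameters toward it. With the penalty switched on, the fitted weight is pulled toward zero, which is the basic mechanism regularization uses against overfitting.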

And if you’ve worked in NLP:

The common NLP pipeline: tokenizers, stemmers, lemmatizers, and common word vectorization methods.
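
As a rough illustration of two of those pipeline stages, the sketch below tokenizes text with a regular expression and turns each document into a bag-of-words count vector. It uses only the standard library; the function names and example sentences are made up for this post, and a real project would more likely rely on an NLP library's tokenizer and vectorizer.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_vocabulary(docs):
    """Map every distinct token to a column index, in sorted order."""
    tokens = sorted({tok for doc in docs for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(tokens)}

def bag_of_words(doc, vocab):
    """Turn one document into a vector of token counts (one entry per vocab word)."""
    counts = Counter(tokenize(doc))
    return [counts.get(tok, 0) for tok in vocab]

docs = ["The cat sat on the mat.", "The dog chased the cat!"]
vocab = build_vocabulary(docs)
vectors = [bag_of_words(d, vocab) for d in docs]
# vocab order: cat, chased, dog, mat, on, sat, the
# vectors → [[1, 0, 0, 1, 1, 1, 2], [1, 1, 1, 0, 0, 0, 2]]
```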

If your work was DL heavy:

Batchwise processing
Backpropagation
Loss functions, optimizers
Hyperparameters, what to tune
Components of NN/CNN/RNN (whichever you’ve used)
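
For backpropagation in particular, it helps to be able to work through the smallest possible case by hand. The sketch below, with made-up inputs, runs a forward and backward pass for a single sigmoid neuron with binary cross-entropy loss, then checks the analytic gradient against finite differences. It is a toy under stated assumptions, not how any DL framework implements autograd.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(w, b, x, y):
    """Forward pass: one sigmoid neuron with binary cross-entropy loss."""
    p = sigmoid(w * x + b)
    loss = -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return p, loss

def backward(p, x, y):
    """Backward pass via the chain rule.

    For sigmoid + cross-entropy, dLoss/dz simplifies to (p - y),
    so dLoss/dw = (p - y) * x and dLoss/db = (p - y).
    """
    dz = p - y
    return dz * x, dz

def numerical_grad(w, b, x, y, eps=1e-6):
    """Central finite differences, used only to check the analytic gradient."""
    dw = (forward(w + eps, b, x, y)[1] - forward(w - eps, b, x, y)[1]) / (2 * eps)
    db = (forward(w, b + eps, x, y)[1] - forward(w, b - eps, x, y)[1]) / (2 * eps)
    return dw, db

# Illustrative values: one training example (x, y) and initial parameters.
w, b, x, y = 0.5, -0.2, 1.5, 1.0
p, loss = forward(w, b, x, y)
dw, db = backward(p, x, y)
dw_num, db_num = numerical_grad(w, b, x, y)  # should agree closely with dw, db
```

An optimizer then uses `dw` and `db` to update the parameters; in batchwise training the same gradients are averaged over the examples in each mini-batch before the update.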

Knowing just these isn’t enough. But if you don’t know even these basic ideas, chances are high that you’ll get rejected in the first few rounds.

Do NOT bluff: No one knows all of the concepts, and that is perfectly alright. You’re not expected to know it all. Most of the time, questions are designed to probe how deeply you know the subject, and the interviewer can tell within the first few sentences how well you know it. Long-winded explanations that lead nowhere will just hurt your chances of getting selected.

DL Toolkits: Communicate truthfully about the tools you have worked with and your level of familiarity with them. That sets the right expectations for the interviewer, who then knows which level of questions to ask. Basic familiarity with these toolkits is expected for most ML roles.

Pandas/NumPy: If you’ve worked on ML/DL projects, you cannot have gotten by without rubbing shoulders with these basic libraries. Not knowing how to perform basic operations with them suggests that you haven’t worked on projects involving any actual data. So it is always advisable to know their basic functionality.
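
By "basic operations" I mean things on the level of the sketch below: column-wise statistics, broadcasting, and boolean masking. The example uses NumPy only, on a tiny made-up feature matrix, and is just one illustration of the kind of fluency meant here.

```python
import numpy as np

# Toy feature matrix: 4 samples (rows) x 2 features (columns); values are made up.
X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 600.0],
              [4.0, 800.0]])

# Column-wise statistics: axis=0 reduces over the rows.
mean = X.mean(axis=0)
std = X.std(axis=0)

# Broadcasting applies the per-column mean and std to every row,
# giving each feature zero mean and unit variance.
X_scaled = (X - mean) / std

# Boolean masking: keep only the rows whose first feature is above its mean.
above = X[X[:, 0] > mean[0]]
```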

Also, some basic understanding of data structures and algorithms is expected of candidates, since they’re ubiquitous across projects.

Exploration beyond your assigned task: Knowledge of recent trends and of the broader background of the topic (even methods that you didn’t directly use in your projects) shows that you are passionate about the topic and have explored it beyond the strictly required basics. (Example: I could’ve used just Faster R-CNN for my project but still know something about methods like YOLO and SSD.)

Know some basic facts about the datasets you used: With the advent of DL, many people just use off-the-shelf ML methods without having explored the dataset. This leaves candidates clueless about the scale and features of the dataset.

Bonus points if:

You know about state-of-the-art methods in the domain you’re working on and have explored recent trends
You have some good (even small-scale) projects on GitHub/Kaggle
You answer succinctly (Yes! Very few people have this skill)
You know linear algebra concepts

Some major red flags (for interested readers)

Have used CNNs for multiple tasks, but not sure why the network has a dropout layer or what it does
Did a thesis on GANs, but unsure which loss function they use
Bluffing about what the convolution operation does in the Conv layer of a CNN
Have 3 years of work experience but have only used off-the-shelf tools and don’t know how they work
Changing the parameter C increased the accuracy, but not sure what C does
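
On the convolution point, the level of understanding I have in mind is roughly what the following sketch shows: a "valid" 2D convolution (no padding, stride 1) written with plain Python lists. The image and kernel values are made up, and real Conv layers add channels, biases, strides, and padding on top of this.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    Each output value is the sum of elementwise products between the kernel
    and the image patch under it. Like most DL frameworks, this does not flip
    the kernel, so strictly speaking it is cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]
feature_map = conv2d_valid(image, kernel)
# → [[-4, -4], [-4, -4]]
```

A 3x3 input and a 2x2 kernel give a 2x2 feature map, and every entry is the same here because the image changes by a constant amount along the diagonal the kernel compares.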

Lastly, your rejection could also be the result of the company looking for people with more hands-on experience or some research background. So there is no need to be demotivated if you don’t make it through the rounds.

One more post, on good resources for preparing for the above, will follow. Let me know in the comments if you have any suggestions or questions, and I’ll try my best to answer.

Wishing you all the very best for your journey!

P.S: Don’t google stuff during web interviews.