Advice on taking the AWS Machine Learning - Specialty exam

Some useful guidelines & tips for passing the AWS Machine Learning–Specialty exam from someone who just passed it

Hi, my name is Fabrice Guillaume and I recently passed the AWS Machine Learning - Specialty exam. I thought I would share some of my experiences to help others prepare for this exam, which I would categorize as “rather challenging but totally do-able if you are well prepared.”

This blog has three parts. First, I will cover why someone might want to take the AWS Machine Learning - Specialty exam. Then I will share some of the online materials that I found useful to prepare for the exam. And finally I will provide some tips for passing the exam.

1. Why should you take the AWS Machine Learning - Specialty exam?

First, to stay up to speed with new developments in tech and ML/AI. These disciplines move extremely fast and are revolutionizing our world, with some really interesting capabilities already part of our day-to-day life. Examples include recommendation engines like Netflix suggesting the next show to binge based on other customers’ selections, Spotify’s personalization of audio tracks based on user preferences and community feedback, Apple’s personal assistant Siri, Amazon Alexa’s Smart Home Hub for controlling connected devices, and Tesla’s Autopilot (self-driving) capabilities. Not to mention Capital One’s very own intelligent assistant Eno, which helps look out for you and your money 24/7. Bottom line: the ML/AI revolution is already happening, and it is only going to accelerate and become further entrenched in our day-to-day lives. Staying up to speed with this fast-moving, accelerating train of transformative, predictive, AI-based capabilities is therefore paramount in the fields of tech and machine learning engineering.

Second, to better understand what machine learning techniques to use and when. The ML/AI toolbox is ever growing. The open source community is very active, with researchers and practitioners from industry and academia publishing code to train, build, deploy, and scale models under permissive licenses, while cloud vendors and technology companies offer ML capabilities in an easy-to-use SaaS model. All these groups are working to make it easier to build and deploy models, but has it actually become “easy”? While SaaS offerings and “auto ML” capabilities do make it easier, I personally feel it is important to understand what is happening under the hood in the lifecycle of a machine learning project, especially when evaluating what data features to use, extract, suppress, or encode, what algorithms to fit, what hyperparameters to tune, when to retrain or refit a model already in production, and so on. Further education on the ML building blocks, tools, and libraries will help you know what to use, when to use it, and how to use and monitor it.

Lastly, this exam is well regarded in the software/data/AI industry. It is not just an advanced-level theoretical exam; it is also quite applied, covering the latest modeling techniques and cloud services. And while some of the topics are AWS specific, the learnings are applicable across many industries, making the certification attractive to companies looking for machine learning engineers who can support the entire lifecycle of a machine learning project.

Considerations before signing up for the AWS Machine Learning - Specialty exam

You may have questions about whether the AWS Machine Learning - Specialty exam is for you. I know I did. Here are my thoughts on who should and should not take it, as well as a little more about the format and structure of the exam.

Recommended past knowledge and experience

In order to pass the AWS Machine Learning - Specialty exam, one does need some experience with AWS and machine learning, namely:

  • 1 to 2 years of experience developing, architecting, or running machine learning/deep learning workloads on the AWS Cloud.
  • Understanding and intuition behind basic ML algorithms.
  • Experience performing basic hyperparameter optimization.
  • Experience with machine learning and deep learning frameworks.
  • The ability to follow model-training best practices.
  • The ability to follow deployment and operational best practices.
  • Prior experience with AWS, in particular with IAM and VPC. Preferably, exposure to or experience with streaming services (Kinesis family) and batch services (AWS Glue, AWS Batch, AWS Data Pipeline).

However, to take the exam you do not need to be:

  • A Python guru - There won’t be any Python coding exercise per se, but you need to be able to read Python, and potentially use it, as you work with the SageMaker Python SDK and read developer guides.
  • A data science guru - Although it will help if you are. There will be advanced modeling questions, such as what hyperparameters to use and when, but you can learn about these with some practice fitting various algorithms.
  • An AWS Certified Solutions Architect - Although it will help if you are. There will be some security-related questions about how to secure data in transit and at rest and how to restrict access to a notebook, as well as questions on the use of various AWS data services for ETL and storage. But you can learn about those through various exercises if you haven’t been exposed to them already.

To set a bit of context about my background, I would categorize myself as a Machine Learning Engineer Lead with lots of experience in data, big data, software and cloud engineering. I started dabbling in the machine learning space at a local startup seven years ago before joining Capital One three years ago to be Director of Data/Machine Learning Engineering working on a machine learning platform initiative. I do not hold a PhD, but I do have a master’s degree in wireless telecommunication with a concentration in math. Having said that, you do not need a master’s or PhD degree to pass this exam, nor will you go really deep into math. The same goes for coding: you will NOT code, nor need to do code reviews, to be ready for this exam. However, coding experience can help.

Topics covered during the AWS Machine Learning - Specialty exam

Per the AWS exam guide, you will be quizzed on 4 key areas:

  1. Data Engineering (20%): S3 (and VPC Gateway Endpoint), Kinesis (Data Streams, Data Firehose, Data Analytics, Video Streams), Glue (Data Catalog and Crawler), Athena, AWS Data Stores (Redshift, RDS/Aurora, DynamoDB, Elasticsearch, ElastiCache), AWS Data Pipeline, AWS Batch, AWS DMS, and AWS Step Functions.
  2. Exploratory Data Analysis (24%): Data Types and Distribution, Time Series, Amazon Athena, QuickSight, Ground Truth, EMR, Spark, Data binning, Transforming, Encoding, Scaling and Shuffling, Dealing with Missing data, Imbalanced data, and Outliers.
  3. Modeling (36%): CNN, RNN, Tuning neural networks, Regularization, Gradient descent method, L1 and L2 regularization, Confusion matrix (Precision, Recall, F1, AUC), Ensemble methods (Bagging and Boosting), Amazon SageMaker, Amazon Algorithms (Linear Learner, XGBoost, Seq2Seq, BlazingText, DeepAR, Object2Vec, Object Detection, Image Classification, Semantic Segmentation, RCF, LDA, KNN, K-Means, PCA, Factorization Machines), and Amazon AI Services (Comprehend, Translate, Transcribe, Polly, Rekognition, Forecast, Lex, etc.).
  4. ML Implementations and Operations (20%): SageMaker Production Variants, Neo, IoT Greengrass, Encryption at Rest and in Transit, VPC, IAM, Logging, Monitoring, Instance Types and Spot Instances, Elastic Inference, Auto-Scaling, Availability Zones, Inference Pipelines, etc.
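
Since the Modeling domain lists confusion-matrix metrics, it can help to compute them by hand at least once. Below is a minimal Python sketch, not taken from any of the courses or from AWS; the function name `classification_metrics` and the example counts are made up for illustration.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute common metrics from binary confusion-matrix counts.

    tp/fp/fn/tn are the true-positive, false-positive, false-negative
    and true-negative counts. Ratios with a zero denominator are
    reported as 0.0 (a convention chosen for this sketch).
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / total if total else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Hypothetical, imbalanced example: 120 actual positives out of 1000 rows.
metrics = classification_metrics(tp=80, fp=20, fn=40, tn=860)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts, accuracy is high while recall is noticeably lower, which is why it is worth knowing precision and recall alongside accuracy when the data is imbalanced.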

Additional info about the AWS Machine Learning - Specialty exam

This exam costs $300 and is 3 hours long (180 minutes). You will have 65 multiple-choice questions, some of which require multiple responses. Personally, I found I had ample time to answer the questions, mark the ones I was not sure about, and review them later. I didn’t feel I had to rush, and I still had time left after reviewing the few questions I had flagged.

To pass the exam, you will need a minimum score of 750 on a scale from 100 to 1000. Unanswered questions earn no points. And be careful! There is no partial credit: if a question has 5 options and 3 correct answers and you only get 2 of the 3 right, you get no points.
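
To make the no-partial-credit rule concrete, here is a small Python sketch of how a multiple-response question would be marked under the rule as I describe it above. The function `score_question` and the option letters are hypothetical, and this only illustrates the all-or-nothing idea; it is not AWS's actual scoring or scaling algorithm.

```python
def score_question(selected, correct):
    """Return 1 only if the selected options exactly match the correct set.

    Any missing or extra option, or an unanswered question, scores 0.
    """
    return 1 if set(selected) == set(correct) else 0


# Hypothetical question: 5 options (A-E), 3 of them correct.
correct = {"A", "C", "E"}

full = score_question({"A", "C", "E"}, correct)      # all three right
partial = score_question({"A", "C", "D"}, correct)   # two right, one wrong
unanswered = score_question(set(), correct)          # left blank

print(full, partial, unanswered)
```

The takeaway for exam day is simply to answer every question and to pick exactly as many options as the question asks for.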

The final score is a “scaled” score, and the questions vary from one exam form to another. Here is more info about the scaled scoring: “AWS Certification exam results are now reported as a score from 100 to 1000. Your score shows how you performed on the examination as a whole and whether or not you passed. Scaled scoring models are used to equate scores across multiple exam forms that may have slightly different difficulty levels.” (source: AWS support page)

2. Online resources and courses for the AWS Machine Learning - Specialty exam

Below are some useful online resources. The references below do not constitute an endorsement or recommendation of their services. I recommend that you explore other online content to find what works for you. Also keep in mind that AWS introduces new services every year, so these online courses will be updated and new ones will be offered.

And for those who really want to focus on a subset of the courses, I would recommend taking at least one of the two (A Cloud Guru and/or Udemy, both of which give you training and hands-on labs), as well as reading the AWS whitepapers, as a “bare minimum”.

A Cloud Guru (paid course, ~15 hours, 1 practice test)

The AWS Certified Machine Learning - Specialty 2020 video training will prepare you for the AWS Certified Machine Learning Specialty exam, with some great materials and hands-on labs.

In this course, you will learn:

  • The domains of knowledge for the AWS Certified Machine Learning Specialty exam
  • Best practices for using the tools and platforms of AWS for data engineering, data analysis, machine learning modeling, model evaluation and deployment
  • Hands-on labs designed to challenge your intuition, creativity and knowledge of the AWS platform

With this course you’ll get a solid understanding of the services and platforms available on AWS for Machine Learning projects, build a foundation to pass the certification exam and feel equipped to use the AWS ML portfolio in your own real-world applications.

I personally enjoyed both trainers (Scott Pletcher and Brock Tubre) and found the slides they used very informative and to the point. I went back to them a few times to recap a few topics. Also, the labs are great (and fun). You can follow along with the video or try a lab yourself first, and they give you the data sets and code if you want to go hands-on.

Udemy (paid course, ~10 hours, 1 practice test)

The AWS Certified Machine Learning Specialty 2020 — Hands On! online course is another great one to attend.

The course is taught by Frank Kane, who spent nine years working at Amazon in the field of machine learning. Frank took and passed this exam on the first try, and knows exactly what it takes for you to pass it. Joining Frank in this course is Stephane Maarek, an AWS expert and popular AWS certification instructor on Udemy.

In addition to the 9-hour video course, a 30-minute quick assessment practice exam is included that consists of the same topics and style as the real exam. You’ll also get four hands-on labs that allow you to practice what you’ve learned, and gain valuable experience in model tuning, feature engineering, and data engineering.

Whizlabs (paid course, ~10 hours, 2 practice exams)

Finally, the Whizlabs Course (and tests) is another great online resource, in particular for its practice exams that I found quite advanced. I would recommend doing those last.

AWS (free for most courses, paid for using AWS services)

There are many AWS online courses and whitepapers that you will want to read to familiarize yourself with the various AI, data pipeline, and security services offered by Amazon, all of which appear in the exam questions.

One way to go about it is to follow the “Developer” and “Data Scientists” tracks in the AWS Machine Learning online course library.

You can also read / watch / use in a notebook:

Additionally, there are some solid Amazon whitepapers that you may want to read to dive deeper into various topics, such as:

Finally, while the exam may not quiz you on recently announced or released services (recent means “within 6 months”), it would still benefit you to stay aware of what AWS is releasing. One good way to do so is to follow the AWS re:Invent articles and videos.

Stanford/Coursera Machine Learning and Deep Learning (paid ~50-70 hours)

To get introduced to Machine Learning and Deep Learning and build a solid understanding of both, I highly recommend the Coursera/Stanford courses from Dr. Andrew Ng below. I found them to be among the best ML and Deep Learning online courses available today. Dr. Ng is a great instructor and goes into great detail, with handwriting, drawing, and storytelling, to explain every important concept that you will be quizzed on, such as regularization, hyperparameter tuning, and so on:

YouTube (free)

There are several videos available on YouTube that thoroughly explain the fundamental concepts of ML and walk viewers through the full ML lifecycle, showing how to use Amazon SageMaker, Jupyter notebooks, and more. Videos such as these could be helpful for those who may not be as “hands-on” as they would like to be, and they also give a better feel for the user experience and interface that ML engineers and data scientists use. Here is a list of recommended videos. Please note that this is by no means a complete list, as new videos are being added every day:

Kindle (paid)

There are various books that will provide you additional information and some of them contain actual practice exams. One example is:

Build Your Own - Cheat Sheet

Since there are many articles, videos, and blogs easily searchable via Google, my recommendation is to go through them and build out your own cheat sheet. Once you feel you are ready to take the exam, simply reference your cheat sheet for a quick refresh.

I built my own cheat sheet based on the answers I got wrong during practice exams, as well as some subtle nuances and details I did not want to forget. I am sharing my cheat sheet as a PDF. However, keep in mind that this is a highly personalized cheat sheet encapsulating my learning experience, and I encourage you to build out your own.

3. Advice on taking practice exams for the AWS Machine Learning - Specialty exam

To pass, you will need to take some practice tests. I highly encourage you to take as many practice tests as possible, ideally several times each. Each of the courses mentioned above offers one or more practice exams as part of the online course. Some may sell “extra” practice exams, which many people consider to be worth the fee.

Here are some pointers:

For reference, I scored between 60% and 80% on my practice tests and eventually got close to 90%, but never above. Ultimately, I scored 92% on the official exam. But one cannot over-prepare: this is truly a difficult exam.

Logistics of taking the real AWS Machine Learning - Specialty exam

When you are ready, you can schedule your exam here (MLS-C01). Due to Covid-19, you can complete this exam from your home.

This is what I did for my exam. They go through significant effort to maintain testing standards and replicate the in-person testing environment. They will ask you to take pictures, then show the room with your webcam. You will not be able to have any pen or paper. And the software that will run on your computer will essentially close all applications but itself and run full screen.

One tactical note that may be of importance (at least it was to me): you cannot use earplugs unless you make a specific request prior to the exam. Personally, that was a struggle for me, as I was trying to fully focus on the exam without hearing any noise from my family.

Final thoughts on the AWS Machine Learning - Specialty exam

I hope this article serves you well in your preparation for the Machine Learning Specialty certification. It has been exciting to be a part of the convergence of academia, tech, and business in the AI and ML space. I hope you continue on with your Machine Learning journey to better yourself personally and professionally.

Last but not least, I want to thank Capital One for being an amazing technology-driven company, focusing on doing the right thing for our customers and associates, as well as supporting all associates to continually invest in themselves.

If you are looking for a place to innovate using the latest open source libraries and frameworks while leveraging the power of the Cloud, Big Data and AI, and if you value a supportive and collaborative environment, relentless innovation, and striving for excellence, look no further! Join us already! We are always hiring top notch talent. You can find exciting opportunities on our Capital One Careers portal.

Best of luck to you all!

Fabrice Guillaume, Director, Data/ML Engineering - Center for Machine Learning

I am a Director of Data/ML Engineering at Capital One, at the Center For Machine Learning (C4ML). I have spent over 20 years getting dirty with data, in consumer and enterprise applications, and now focusing on growing and scaling Machine Learning capabilities leveraging the power of the Cloud. When I’m not wrangling, munging, or modeling data you can find me doing some ultra endurance sport (Ironman, Ultra marathon), snowboarding, mountain biking, or fishing with both my sons and wife!
