Databricks Certified Machine Learning Professional Online Training
The questions for Databricks Machine Learning Professional were last updated on Apr 18, 2025.
- Exam Code: Databricks Machine Learning Professional
- Exam Name: Databricks Certified Machine Learning Professional
- Certification Provider: Databricks
- Latest update: Apr 18, 2025
Which of the following operations in Feature Store Client fs can be used to return a Spark DataFrame of a data set associated with a Feature Store table?
- A . fs.create_table
- B . fs.write_table
- C . fs.get_table
- D . There is no way to accomplish this task with fs
- E . fs.read_table
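For reference, the client methods named in the options can be compared in a short sketch. It assumes a Databricks environment in which `fs` is a `databricks.feature_store.FeatureStoreClient`; the helper name `describe_and_read` is hypothetical and not part of the library.

```python
def describe_and_read(fs, table_name):
    """Return (metadata, data) for a Feature Store table.

    Assumption: `fs` is a databricks.feature_store.FeatureStoreClient.
    """
    # get_table returns a metadata object describing the table
    # (keys, features, description), not the rows themselves.
    metadata = fs.get_table(table_name)
    # read_table returns the table contents as a Spark DataFrame.
    data = fs.read_table(table_name)
    return metadata, data
```

On a cluster this might be called as `meta, df = describe_and_read(fs, "my_db.my_features")`, where the table name is a placeholder.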
Run a statistical test to determine if there are changes over time
Which of the following should be completed as Step #3?
- A . Obtain the observed (actual) feature values
- B . Measure the latency of the prediction time
- C . Retrain the model
- D . None of these should be completed as Step #3
- E . Compute the evaluation metric using the observed and predicted values
Which of the following is a reason for using Jensen-Shannon (JS) distance over a Kolmogorov-Smirnov (KS) test for numeric feature drift detection?
- A . All of these reasons
- B . JS is not normalized or smoothed
- C . None of these reasons
- D . JS is more robust when working with large datasets
- E . JS does not require any manual threshold or cutoff determinations
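To make the Jensen-Shannon distance concrete, here is a minimal pure-Python sketch that bins two numeric samples on shared edges and compares the resulting histograms. The bin count, the function names, and the equal-width binning strategy are assumptions made for illustration; with base-2 logarithms the distance falls between 0 (identical histograms) and 1 (no overlap).

```python
import math

def shared_histograms(reference, current, bins=10):
    """Bin both samples on the same equal-width edges; return two probability lists."""
    lo = min(min(reference), min(current))
    hi = max(max(reference), max(current))
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are equal

    def to_probs(values):
        counts = [0] * bins
        for v in values:
            # Clamp the maximum value into the last bin.
            idx = min(int((v - lo) / width), bins - 1)
            counts[idx] += 1
        return [c / len(values) for c in counts]

    return to_probs(reference), to_probs(current)

def js_distance(p, q):
    """Jensen-Shannon distance (square root of the base-2 JS divergence)."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with zero probability contribute nothing to the sum.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    jsd = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return math.sqrt(max(jsd, 0.0))
```

A typical use would be `js_distance(*shared_histograms(train_values, recent_values))`, where both argument names are placeholders for a reference sample and a recent sample of the same feature.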
A data scientist is utilizing MLflow to track their machine learning experiments. After completing a series of runs for the experiment with experiment ID exp_id, the data scientist wants to programmatically work with the experiment run data in a Spark DataFrame. They have an active MLflow Client client and an active Spark session spark.
Which of the following lines of code can be used to obtain run-level results for exp_id in a Spark DataFrame?
- A . client.list_run_infos(exp_id)
- B . spark.read.format("delta").load(exp_id)
- C . There is no way to programmatically return row-level results from an MLflow Experiment.
- D . mlflow.search_runs(exp_id)
- E . spark.read.format("mlflow-experiment").load(exp_id)
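As a sketch of the Spark-based approach named in the options, the helper below reads an experiment through the `mlflow-experiment` data source. It assumes a Databricks runtime where that data source is registered with the Spark session; the function name `runs_as_spark_df` is hypothetical.

```python
def runs_as_spark_df(spark, exp_id):
    """Load run-level data for an MLflow experiment as a Spark DataFrame.

    Assumption: the "mlflow-experiment" data source is available,
    as it is on Databricks runtimes.
    """
    return spark.read.format("mlflow-experiment").load(exp_id)
```

By comparison, `mlflow.search_runs` returns a pandas DataFrame by default, so its result would need to pass through `spark.createDataFrame` before it could be used as a Spark DataFrame.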
A data scientist has developed and logged a scikit-learn random forest model model, and then they ended their Spark session and terminated their cluster. After starting a new cluster, they want to review the feature_importances_ of the original model object.
Which of the following lines of code can be used to restore the model object so that feature_importances_ is available?
- A . mlflow.load_model(model_uri)
- B . client.list_artifacts(run_id)["feature-importances.csv"]
- C . mlflow.sklearn.load_model(model_uri)
- D . This can only be viewed in the MLflow Experiments UI
- E . client.pyfunc.load_model(model_uri)
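The difference between MLflow's flavor-specific and generic loaders matters here, so a short sketch may help. It assumes MLflow is installed and that `model_uri` points at a model logged with the scikit-learn flavor; the helper name `restore_sklearn_model` is hypothetical.

```python
def restore_sklearn_model(model_uri):
    """Reload a logged scikit-learn model as its native estimator object."""
    # Imported lazily so the sketch can be defined without MLflow installed.
    import mlflow.sklearn

    # The sklearn flavor returns the original estimator, so fitted attributes
    # such as feature_importances_ remain accessible on the result.
    return mlflow.sklearn.load_model(model_uri)
```

The generic `mlflow.pyfunc.load_model` instead returns a wrapper whose main interface is `predict`, so estimator-specific attributes are not exposed directly on it.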
Which of the following is a simple statistic to monitor for categorical feature drift?
- A . Mode
- B . None of these
- C . Mode, number of unique values, and percentage of missing values
- D . Percentage of missing values
- E . Number of unique values
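The three statistics named in the options are cheap to compute. The sketch below does so in plain Python for a single categorical column, under the assumption that missing values are represented as `None`; the function name and the returned dictionary keys are illustrative choices.

```python
from collections import Counter

def categorical_summary(values):
    """Return mode, number of unique values, and percent missing for a column."""
    present = [v for v in values if v is not None]
    counts = Counter(present)
    # most_common(1) yields the most frequent category, if any value is present.
    mode = counts.most_common(1)[0][0] if counts else None
    pct_missing = 100.0 * (len(values) - len(present)) / len(values) if values else 0.0
    return {"mode": mode, "n_unique": len(counts), "pct_missing": pct_missing}
```

Computing the same summary on a reference window and on a recent window, then comparing the two dictionaries, gives a simple first check for categorical drift.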
Which of the following is a probable response to identifying drift in a machine learning application?
- A . None of these responses
- B . Retraining and deploying a model on more recent data
- C . All of these responses
- D . Rebuilding the machine learning application with a new label variable
- E . Sunsetting the machine learning application
A data scientist has computed updated feature values for all primary key values stored in the Feature Store table features. In addition, feature values for some new primary key values have also been computed. The updated feature values are stored in the DataFrame features_df. They want to replace all data in features with the newly computed data.
Which of the following code blocks can they use to perform this task using the Feature Store Client fs?
A)
B)
C)
D)
E)
- A . Option A
- B . Option B
- C . Option C
- D . Option D
- E . Option E
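The option code blocks are not reproduced above, so the following is not a reconstruction of any of them. It is only a sketch of how the legacy `databricks.feature_store` client's `write_table` method distinguishes replacing a table from upserting into it, assuming `fs` is a `FeatureStoreClient`; the helper name is hypothetical.

```python
def replace_feature_table(fs, table_name, features_df):
    """Replace all rows of a Feature Store table with features_df.

    Assumption: `fs` is a databricks.feature_store.FeatureStoreClient.
    """
    # mode="overwrite" replaces the existing table contents, whereas the
    # default mode="merge" upserts rows by primary key.
    fs.write_table(name=table_name, df=features_df, mode="overwrite")
```

For the scenario described, the call would look like `replace_feature_table(fs, "features", features_df)`.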