Why is it important to create hypotheses about the behavior of the AI system?
- A . It simplifies the coding process for developers
- B . It helps predict and mitigate potential risks associated with the system
- C . It is primarily for marketing purposes
- D . It fulfills a legal requirement for AI development
In the context of machine learning, what does the term ‘model drift’ refer to?
- A . The migration of the model from one server to another
- B . The change in model parameters due to new updates
- C . The change in model performance due to changes in underlying data patterns
- D . The physical movement of the hardware running the model
Which approach is recommended for prioritizing business opportunities when planning an MVP?
- A . Choosing the most straightforward implementation irrespective of impact
- B . Assessing the potential return on investment and strategic fit
- C . Prioritizing based on the preference of the project manager
- D . Focusing solely on technological innovation
What is the primary purpose of monitoring a model in production?
- A . To enhance the visual appeal of the model’s output
- B . To ensure the model’s performance remains stable over time
- C . To reduce the model’s complexity for easier understanding
- D . To increase the model’s training speed
Which of the following are essential tasks when preparing data for exploratory analysis? (Choose Three)
- A . Labeling data accurately
- B . Ensuring data is representative of the entire population
- C . Assigning random values to missing data points
- D . Anonymizing sensitive information
- E . Organizing data chronologically
In the context of classification, what does the term ‘overfitting’ refer to?
- A . The model performs equally well on the training and test datasets
- B . The model performs poorly on both training and test datasets
- C . The model performs too well on the training dataset but poorly on unseen data
- D . The model requires too much time to train due to a large dataset
Which of the following are considered direct effects of an AI solution? (Choose Two)
- A . Enhancements in process efficiency for which the AI was designed
- B . Increased job satisfaction among employees not using the AI directly
- C . Reduction in operational costs due to automation
- D . New market opportunities stemming from the innovation
How do you assess the feasibility of an AI solution?
- A . By evaluating the available technology and resources
- B . By creating detailed financial models only
- C . By ensuring the project is the top priority of the organization
- D . By hiring external consultants to validate the solution
What is the primary use of the WHERE clause in an SQL query?
- A . To specify which columns to retrieve
- B . To limit the results to rows that meet certain conditions
- C . To identify the tables involved in the query
- D . To denote the end of the SQL query
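As a rough illustration of how a WHERE clause filters rows, the sketch below runs a query against an in-memory SQLite database using Python's standard `sqlite3` module. The `customers` table, its `name` and `age` columns, the sample rows, and the `adults` helper are all assumptions made for this example.

```python
import sqlite3


def adults(rows, min_age=18):
    """Return (name, age) rows whose age is at least min_age."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (name TEXT, age INTEGER)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)", rows)
    # WHERE keeps only the rows that satisfy the condition.
    result = conn.execute(
        "SELECT name, age FROM customers WHERE age >= ? ORDER BY name",
        (min_age,),
    ).fetchall()
    conn.close()
    return result


# Hypothetical sample data, not taken from any real dataset.
sample = [("Ana", 34), ("Ben", 15), ("Cy", 52)]
print(adults(sample))  # [('Ana', 34), ('Cy', 52)]
```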
What is the first step in aligning on user intents for an AI solution?
- A . Prototyping the solution
- B . Identifying key stakeholders
- C . Conducting a market analysis
- D . Documenting technical requirements
How does IBM Garage Methodology suggest measuring success for an MVP?
- A . By the number of features implemented
- B . Through stakeholder satisfaction and feedback
- C . By comparing the MVP to competitor products
- D . Solely by financial metrics achieved
In assessing progress on the AI Ladder, which aspects should be considered? (Choose Two)
- A . The quality and accessibility of data
- B . The color palette of the user interface
- C . Integration capabilities with existing systems
- D . Branding and marketing strategies
When monitoring models in production, what aspect is crucial for maintaining long-term reliability?
- A . Regularly updating the user interface
- B . Ensuring the model is scalable to handle increased loads
- C . Reducing the number of inputs to the model
- D . Focusing solely on increasing model speed
Which feature engineering technique can be used to simplify models and improve interpretability?
- A . One-hot encoding categorical variables
- B . Normalizing continuous variables
- C . Removing correlated features
- D . Increasing the number of features
How does feature scaling benefit the process of exploratory data analysis?
- A . It changes the underlying data distribution
- B . It makes different variables comparable
- C . It simplifies the database management system
- D . It eliminates the need for data cleaning
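One common form of feature scaling is standardization (z-scores), which puts variables measured in different units on a comparable scale. The sketch below is a minimal standard-library version. The `standardize` helper and the sample values are illustrative assumptions, and min-max scaling would be an equally valid choice.

```python
import statistics


def standardize(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        raise ValueError("cannot standardize a constant feature")
    return [(v - mean) / std for v in values]


# Hypothetical features on very different scales.
ages = [25, 35, 45]
incomes = [30_000, 60_000, 90_000]

# After scaling, both features take the same values despite different units.
print(standardize(ages))
print(standardize(incomes))
```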
For implementing dimensionality reduction, which method would be most effective when dealing with highly nonlinear data?
- A . Linear Discriminant Analysis (LDA)
- B . PCA
- C . t-Distributed Stochastic Neighbor Embedding (t-SNE)
- D . Factor Analysis
What are two reasons a data point would be treated as an outlier?
- A . If the value is greater than mean
- B . If the value is greater than median
- C . If the value is greater than standard deviation
- D . If the value is below the upper end of the bottom quartile by more than 1.5 times the interquartile range
- E . If the value is above the lower end of the top quartile by more than 1.5 times the interquartile range
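The interquartile-range rule mentioned in the options can be sketched as follows: a point is flagged when it lies more than 1.5 times the IQR below the first quartile (Q1) or above the third quartile (Q3). The `iqr_outliers` helper and the sample values are assumptions for illustration. Quartile conventions differ between tools, and this sketch uses the default method of `statistics.quantiles`.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical measurements with one obviously extreme value.
data = [10, 12, 11, 13, 12, 11, 95]
print(iqr_outliers(data))  # [95]
```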
Which practice is least effective in configuring environments for training machine learning models?
- A . Using virtual environments to manage dependencies
- B . Using the latest but unstable software versions
- C . Regularly updating libraries to their stable versions
- D . Allocating resources based on model requirements
Why is logistic regression considered a linear classifier?
- A . Because it is only capable of linear regression tasks
- B . Because it uses a linear decision boundary to separate classes
- C . Because it applies a nonlinear transformation to the input features
- D . Because it computes the decision boundary using a non-linear optimization
What considerations should be made when evaluating the ethical implications of a business problem? (Choose Three)
- A . Potential for AI to replace human jobs
- B . Environmental impact of AI solutions
- C . Impact on company profit margins
- D . Consequences for user privacy and autonomy
- E . Speed of implementation
Which of the following is a common use case for recommendation engines?
- A . Predicting property prices
- B . Detecting fraudulent credit card transactions
- C . Suggesting products to customers based on past purchases
- D . Categorizing news articles into topics
You need to compare sales performance across different regions.
Which type of chart would most effectively serve this purpose?
- A . Histogram
- B . Box plot
- C . Bar chart
- D . Heatmap
If the goal is to explore the central tendency and variability of a dataset, which types of plots would be most informative?
- A . Bar chart and line plot
- B . Histogram and box plot
- C . Scatterplot and heatmap
- D . Pie chart and line plot
Which approach is best for refining an AI solution based on feasibility assessment?
- A . Increasing the complexity of the solution
- B . Reducing scope to match available resources and capabilities
- C . Outsourcing the entire project
- D . Ignoring feasibility concerns to speed up deployment
In SQL, how would you extract the ‘name’ and ‘age’ columns from a table named ‘customers’?
- A . SELECT name, age FROM customers;
- B . EXTRACT name, age FROM customers;
- C . GET name, age IN customers;
- D . PULL name, age OUT OF customers;
Which algorithm is most appropriate for non-linear classification problems?
- A . Linear regression
- B . Logistic regression
- C . Support Vector Machine with non-linear kernels
- D . K-means clustering
Which approach would not be suitable for assessing model fairness?
- A . Analyzing confusion matrices for different subgroups
- B . Using the same performance metric for all models
- C . Conducting audits on model decisions
- D . Implementing external fairness monitoring tools
Which techniques ensure a model can explain its decisions and predictions?
- A . Implementing deep learning models exclusively
- B . Using highly non-linear models without any simplification
- C . Integrating explanation frameworks like LIME or SHAP
- D . Minimizing the use of regularization techniques
Converting a neural network into the newest version of TensorFlow or another deep-learning package is what type of performance drift or software decay?
- A . Data changes
- B . Concept drift
- C . Software changes
- D . Sampling bias and selection bias changes
How can ensemble modeling improve machine learning performance?
- A . By simplifying the models to reduce computation time
- B . By combining multiple models to reduce variance and bias
- C . By using a single, highly accurate model
- D . By focusing exclusively on increasing model accuracy
Which is a primary goal of AI design thinking in relation to business problems?
- A . Maximizing the speed of development
- B . Understanding user needs and pain points
- C . Choosing the most advanced AI technologies
- D . Ensuring the project is completed under budget
What is the primary goal when splitting data into training, testing, and validation sets?
- A . To increase the computational speed of model training
- B . To ensure the model generalizes well to new data
- C . To use all available data for training to improve accuracy
- D . To test all models on the same set of data
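A three-way split can be sketched with the standard library alone. The `split_data` helper, the 70/15/15 proportions, and the fixed seed are assumptions for this example. In practice a library routine (and stratification, where classes are imbalanced) would usually be preferred.

```python
import random


def split_data(items, train=0.7, val=0.15, seed=0):
    """Shuffle items and split them into train, validation, and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    train_set = shuffled[:n_train]
    val_set = shuffled[n_train:n_train + n_val]
    test_set = shuffled[n_train + n_val:]  # whatever remains
    return train_set, val_set, test_set


train_set, val_set, test_set = split_data(range(100))
print(len(train_set), len(val_set), len(test_set))  # 70 15 15
```

The model is fit on the training set, tuned on the validation set, and scored once on the test set, so the final score reflects data the model has not seen.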
Which method in Pandas would you use to rename the columns of a DataFrame?
- A . df.rename_columns()
- B . df.columns = ['new_name1', 'new_name2']
- C . df.rename({'old_name': 'new_name'}, axis=1)
- D . df.set_names([‘new_name1’, ‘new_name2’])
In K-Nearest Neighbors (KNN), what does K represent?
- A . The number of clusters to form
- B . The number of training samples to use
- C . The number of features to consider
- D . The number of nearest neighbors to consider
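To make the role of K concrete, here is a minimal K-Nearest Neighbors classifier using Euclidean distance and majority voting. The `knn_predict` helper and the toy points are assumptions for illustration, and ties are not handled in any special way.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict the label of query from its k closest training points.

    train is a list of (point, label) pairs; points are coordinate tuples.
    """
    nearest = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical two-cluster toy data.
train = [
    ((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
    ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b"),
]
print(knn_predict(train, (0.5, 0.5)))  # a
print(knn_predict(train, (5.5, 5.5)))  # b
```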
What is the benefit of feature scaling in model training?
- A . It increases the number of features for better accuracy
- B . It helps algorithms converge faster by normalizing feature magnitudes
- C . It decreases the transparency of the model
- D . It is only useful for unsupervised learning
Given an SQL table ‘Books’ with fields Title, Author, and Genre, which query would return a list of unique Genre values?
- A . SELECT Genre FROM Books;
- B . SELECT SET Genre FROM Books;
- C . SELECT UNIQUE Genre FROM Books;
- D . SELECT DISTINCT Genre FROM Books;
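The effect of deduplicating a column can be checked against an in-memory SQLite database. The `unique_genres` helper and the placeholder rows below are assumptions for this sketch. Only the table and field names come from the question.

```python
import sqlite3


def unique_genres(books):
    """Return each Genre value once, in alphabetical order."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Books (Title TEXT, Author TEXT, Genre TEXT)")
    conn.executemany("INSERT INTO Books VALUES (?, ?, ?)", books)
    rows = conn.execute(
        "SELECT DISTINCT Genre FROM Books ORDER BY Genre"
    ).fetchall()
    conn.close()
    return [genre for (genre,) in rows]


# Placeholder rows, not real books.
books = [
    ("Title A", "Author X", "Fantasy"),
    ("Title B", "Author Y", "Mystery"),
    ("Title C", "Author Z", "Fantasy"),
]
print(unique_genres(books))  # ['Fantasy', 'Mystery']
```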
In the context of anomaly detection, what is the algorithm primarily searching for?
- A . Patterns that do not conform to expected behavior
- B . The best way to group similar data points
- C . The optimal number of clusters in the data
- D . The strongest predictors of a target variable
What are effective strategies for handling missing data? (Choose Two)
- A . Deleting all rows with any missing values
- B . Imputing missing values using statistical methods
- C . Using a machine learning model to predict missing values
- D . Ignoring missing data during analysis
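Statistical imputation can be as simple as replacing each missing entry with the mean of the observed values. The sketch below assumes missing values are represented as `None`. The `impute_mean` helper is hypothetical, and the median or a model-based estimate may suit skewed data better.

```python
import statistics


def impute_mean(values):
    """Replace None entries with the mean of the non-missing values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to impute from")
    fill = statistics.fmean(observed)
    return [fill if v is None else v for v in values]


print(impute_mean([1.0, None, 3.0]))  # [1.0, 2.0, 3.0]
```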
Principal Component Analysis (PCA) is a common technique for which of the following?
- A . Regression
- B . Classification
- C . Clustering
- D . Dimensionality reduction
For a model that needs to explain itself, which methods could be appropriately used? (Choose Three)
- A . LIME (Local Interpretable Model-agnostic Explanations)
- B . SHAP (SHapley Additive exPlanations)
- C . Embedding model parameters directly into the user interface
- D . Feature importance scores
- E . Randomizing input features to test output variation
A Logistic Regression algorithm is used to classify images into four categories.
If each image has a 5×5 pixel dimension, what is the number of weights required (excluding biases) for this model?
- A . 200
- B . 100
- C . 300
- D . 400
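The weight count for this kind of model can be worked out by multiplying the number of input features by the number of classes. The sketch below assumes single-channel (grayscale) images and one weight vector per class, as in a softmax or one-vs-rest formulation. The question does not state either assumption, and the `logistic_weight_count` helper is hypothetical.

```python
def logistic_weight_count(height, width, n_classes, channels=1):
    """Count weights (no biases) for a multi-class logistic regression.

    Assumes every pixel value is one input feature and every class
    has its own weight vector over those features.
    """
    n_features = height * width * channels
    return n_features * n_classes


# 5 x 5 pixels = 25 features; 25 features x 4 classes = 100 weights.
print(logistic_weight_count(5, 5, 4))  # 100
```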
Which type of plot would best illustrate the distribution of a single continuous variable?
- A . Line plot
- B . Bar chart
- C . Histogram
- D . Scatterplot
If you are using Python for data visualization, which library would you select to create a violin plot?
- A . Pandas
- B . Matplotlib
- C . Seaborn
- D . Plotly
Which technique is typically used to prevent overfitting in a decision tree classifier?
- A . Increasing the depth of the tree indefinitely
- B . Using a linear kernel instead of a polynomial kernel
- C . Pruning the tree to remove non-significant branches
- D . Applying PCA before training the classifier
What should be assessed to gauge progress in collecting data?
- A . The aesthetic appeal of data visualizations
- B . Compliance with data protection and privacy laws
- C . Speed of the oldest computer systems
- D . Number of printers in the office