Which of the following are considered monitoring tools and metrics? (Select all that apply)
- A . CPU Usage
- B . Load Balancer Configuration
- C . Number of Employees in the Organization
- D . Network Latency
Which technique for Root Cause Analysis involves asking "why" repeatedly to delve deeper into the underlying causes of an incident?
- A . Ishikawa diagram (Fishbone diagram)
- B . 5 Whys technique
- C . Fault Tree Analysis
- D . Incident Escalation Matrix
When troubleshooting a performance issue, which technique helps identify the slowest part of a system?
- A . Load testing
- B . Stress testing
- C . Profiling
- D . Code review
What is a typical toolchain used in Incident Management?
- A . Incident detection, post-mortem analysis, monitoring
- B . Ticketing system, change management, customer communication
- C . Root cause analysis, automation, backup system
- D . Disaster recovery, service catalog, service level agreement (SLA) management
Which incident management team role manages the investigation, communication, and resolution of major incidents?
- A . Incident Commander
- B . First Responder
- C . Subject Matter Expert
- D . Site Reliability Engineer
Which of the following describe Problem Management? (Select all that apply)
- A . Identifying root causes of incidents
- B . Ensuring that incidents are managed correctly
- C . Proactively preventing incidents from occurring
- D . Escalating incidents to upper management
Which of the following describes an IBM Cloud Code Engine Build?
- A . Creates container images from the source code
- B . Serves HTTP requests and has a URL for incoming requests
- C . Runs one or more instances of the executable code
- D . Manages resources and provides access to its entities
What is the primary objective of disaster recovery planning?
- A . To prevent all types of disasters from occurring
- B . To minimize the impact of a disaster on the organization
- C . To eliminate the need for data replication
- D . To achieve 100% uptime for all services
How can information for troubleshooting be found? (Select all that apply)
- A . Reviewing system logs and error messages
- B . Consulting technical documentation and knowledge bases
- C . Ignoring historical data to focus on the current problem
- D . Interviewing team members and stakeholders
What is the key advantage of canary deployment compared to other zero downtime deployment models?
- A . It allows for instant rollbacks in case of issues
- B . It requires no additional infrastructure setup
- C . It eliminates the need for automated testing
- D . It ensures that all users experience the new version simultaneously
How can SRE roles and responsibilities be applied to application deployments?
- A . SREs are not involved in application deployments
- B . SREs handle all aspects of application development and deployment
- C . SREs focus on the reliability and availability of applications during deployment
- D . SREs are responsible for development only and not deployment activities
Which of the following is NOT a recommended troubleshooting technique?
- A . Monitoring system logs for errors and warnings
- B . Collecting data systematically and thoroughly
- C . Asking for help and collaboration with teammates
- D . Randomly making changes to system configurations without analysis
What are the basic skills and habits of effective troubleshooting? (Select all that apply)
- A . Making assumptions without analyzing the data
- B . Keeping detailed documentation of troubleshooting steps
- C . Relying solely on intuition and gut feeling
- D . Using systematic approaches to isolate and fix problems
When is it appropriate to use alerts in monitoring systems?
- A . Only during planned maintenance windows
- B . Only during business hours
- C . When a service violates its defined SLO
- D . When engineers want to test the alerting system
What is IBM Cloud Code Engine?
- A . A cloud-based IDE for software development
- B . A serverless platform for running containerized applications
- C . A machine learning service for data analysis
- D . A virtual machine hosting service
What is the primary goal of Problem Management?
- A . Restoring services to normal operation as quickly as possible
- B . Identifying the root causes of incidents and addressing underlying issues
- C . Communicating with customers about ongoing incidents
- D . Documenting incidents for legal purposes
Which troubleshooting technique involves isolating the problematic component by dividing the system in half and checking each part?
- A . Binary search
- B . Active probing
- C . Traceroute
- D . Post-mortem analysis
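
The halving approach described in the question above can be sketched in a few lines. This is a minimal illustration, not a prescribed procedure: the stage names and the `output_ok_after` probe are hypothetical, and the sketch assumes that the stages run in a fixed order, that the final output is known to be bad, and that once one stage breaks the output, every later check also fails.

```python
from typing import Callable


def first_failing_stage(stage_count: int, output_ok_after: Callable[[int], bool]) -> int:
    """Return the index of the first stage whose output is bad.

    Assumes at least one stage is failing and that failures persist
    downstream (a check after any later stage also fails).
    """
    lo, hi = 0, stage_count - 1  # the first failing stage lies in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if output_ok_after(mid):
            lo = mid + 1  # everything up to mid is fine, look in the later half
        else:
            hi = mid      # the fault is at mid or earlier
    return lo


# Hypothetical five-stage pipeline in which "enrich" (index 2) is broken.
stages = ["ingest", "parse", "enrich", "store", "render"]
probes = []


def probe(index: int) -> bool:
    probes.append(index)  # record each check to show how few are needed
    return index < 2      # output is correct only before the broken stage


culprit = first_failing_stage(len(stages), probe)
print(stages[culprit], "found after", len(probes), "checks")
```

Each check discards half of the remaining candidates, so the number of checks grows with the logarithm of the number of components rather than linearly.
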
Which of the following are data storage replication concepts used for high availability and disaster recovery? (Select all that apply)
- A . Backup and restore of virtual machines
- B . Synchronous and asynchronous data replication
- C . Using RAID (Redundant Array of Independent Disks) for replication
- D . Cross-region data replication for disaster recovery
What are user-related security policies designed to do?
- A . Restrict access to system administrators only
- B . Promote sharing of sensitive information among users
- C . Ensure that all users have the same level of privileges
- D . Define rules and guidelines for user access and behavior
Which practice describes subject matter experts collaborating through an instant-messaging platform?
- A . SlackOps
- B . ChatOps
- C . DevOps
- D . SREOps
Which technique is commonly used for conducting a Root Cause Analysis?
- A . Change management
- B . Incident escalation
- C . Fishbone diagram
- D . Service catalog management
What is the definition of "high availability" in the context of service architecture components?
- A . The ability of a system to handle an increasing number of users
- B . The capability of a system to recover from failures and continue functioning
- C . The practice of running multiple instances of a service to balance the load
- D . The process of ensuring that all data is consistently backed up
Which of the following is a characteristic of IBM Cloud Code Engine?
- A . It requires manual scaling of applications based on traffic demands
- B . It is limited to a specific programming language for applications
- C . It only supports applications in a specific region of the IBM Cloud
- D . It abstracts away infrastructure management for serverless applications
Which of the following best describes rank-ordered actions in Incident Management?
- A . Taking actions based on the impact and severity of the incident
- B . Implementing actions without considering their priority or importance
- C . Randomly selecting actions from a predefined list
- D . Ignoring the severity of incidents and focusing on quick fixes
In the context of monitoring and observability, what are "SLOs"?
- A . Specific Logging Observations
- B . Service Level Outcomes
- C . Service Level Objectives
- D . System Logging Operations
Which technique aims to ensure a system remains reliable during peak traffic or high load situations?
- A . Failover
- B . A/B testing
- C . Capacity planning
- D . Blue-green deployment
What are the three components of the SRE Error Budget Policy? (Select all that apply)
- A . Service Level Objectives (SLOs)
- B . Service Level Agreements (SLAs)
- C . Service Level Indicators (SLIs)
- D . Service Level Targets (SLTs)
What are the key tenets of Incident Management? (Select all that apply)
- A . Prioritizing low-impact incidents over high-impact ones
- B . Incident detection and alerting
- C . Incident response and mitigation
- D . Incident escalation and resolution
What are the uses and applications of metrics in monitoring? (Select all that apply)
- A . Measuring system performance and health
- B . Evaluating user satisfaction with the service
- C . Defining Incident Management workflows
- D . Tracking project timelines and deadlines
What are the benefits of implementing resiliency techniques in a system?
- A . Decreased system complexity and lower costs
- B . Improved customer experience and increased user engagement
- C . Reduced need for monitoring and observability tools
- D . Increased number of features and functionalities
In the context of Incident Management, what is the purpose of a Post-Incident Review (PIR)?
- A . Assigning blame to individuals responsible for the incident
- B . Documenting the incident for legal purposes
- C . Identifying the root cause of the incident and learning from it
- D . Informing customers about the resolution of the incident
What is the primary goal of troubleshooting in the context of SRE?
- A . Assigning blame for incidents
- B . Identifying the root cause of issues and resolving them
- C . Ignoring incidents and waiting for them to resolve on their own
- D . Documenting incidents for legal purposes
Which deployment strategy allows a new version of software to be released to a small subset of users before a full rollout?
- A . Canary deployment
- B . Blue-green deployment
- C . Rolling deployment
- D . Feature-flag deployment
What is the main purpose of Security Information and Event Management (SIEM) solutions?
- A . To replace firewalls and antivirus software
- B . To monitor user activities without their knowledge
- C . To centralize and correlate security events and logs
- D . To enforce strict access controls for all users
Which of the following are troubleshooting techniques? (Select all that apply)
- A . Rebooting the system as the first step
- B . Using debugging tools and diagnostic commands
- C . Guessing the root cause without investigation
- D . Analyzing system metrics and performance data
When troubleshooting block storage issues, what should be checked first?
- A . The CPU utilization of the VM hosting the block storage
- B . The data center location of the block storage
- C . The network connectivity between the VM and the block storage
- D . The physical condition of the storage disk
What is the first step to recognizing security issues in a system?
- A . Waiting for an incident to occur before taking any action
- B . Regularly conducting security audits
- C . Ignoring security alerts and notifications
- D . Monitoring system logs and analyzing anomalies
What is one of the primary benefits of effective troubleshooting in SRE?
- A . Increased mean time to repair (MTTR) for incidents
- B . Decreased stress and workload for SRE team members
- C . Avoidance of any incidents or issues in the system
- D . Relying solely on intuition rather than systematic approaches
What is the relationship between Incident Management and Problem Management processes?
- A . Incident Management is a subset of Problem Management
- B . Problem Management is a subset of Incident Management
- C . Both processes are unrelated and operate independently
- D . Incident Management and Problem Management work together to resolve incidents and identify underlying issues
Which zero downtime deployment model involves running multiple versions of the application and gradually shifting traffic to the new version?
- A . Canary deployment
- B . Blue-green deployment
- C . Rolling deployment
- D . Feature-flag deployment
How does IBM handle security incident response management?
- A . By ignoring security incidents and focusing on other tasks
- B . By outsourcing incident response to third-party companies
- C . By promptly detecting, analyzing, and mitigating security incidents
- D . By involving only the IT department in incident response
What are the benefits of monitoring and observability? (Select all that apply)
- A . Identifying system bottlenecks and performance issues
- B . Preventing all incidents from occurring
- C . Making informed decisions based on data
- D . Eliminating the need for incident management
Which types of monitoring are commonly used to observe a system? (Select all that apply)
- A . Black-box monitoring
- B . White-box monitoring
- C . Performance monitoring
- D . Intrusion monitoring
The four golden signals used for monitoring and observability are:
- A . Uptime, Response Time, CPU Usage, Network Bandwidth
- B . Latency, Traffic, Errors, Saturation
- C . Requests Per Second, Memory Usage, Disk Space, Response Time
- D . Availability, Throughput, Utilization, Latency
When a metric is at or above the warning threshold, but below the critical threshold, the metric is in which state?
- A . Alert
- B . Warning
- C . Concern
- D . Pre-critical
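
The two-threshold rule in the question above can be expressed as a small comparison. This is a sketch under stated assumptions: the function name, the state labels, and the example thresholds are illustrative and are not taken from any particular monitoring product, and it assumes a metric where higher values are worse.

```python
def metric_state(value: float, warning: float, critical: float) -> str:
    """Classify a reading against a warning and a critical threshold."""
    if warning > critical:
        raise ValueError("warning threshold must not exceed critical threshold")
    if value >= critical:
        return "critical"
    if value >= warning:
        return "warning"  # at or above warning, but below critical
    return "ok"


# Hypothetical CPU-usage thresholds, in percent.
for reading in (42.0, 80.0, 89.9, 90.0):
    print(reading, metric_state(reading, warning=80.0, critical=90.0))
```

The critical check comes first so that a reading above both thresholds is reported in the more severe state.
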
Which of the following is a recommended practice for securing applications on IBM Cloud?
- A . Storing all sensitive data in plain text
- B . Using a single shared account for all team members
- C . Implementing multi-factor authentication (MFA)
- D . Allowing anonymous access to resources
What is the primary difference between blue-green deployment and rolling deployment?
- A . Blue-green deployment requires manual intervention, while rolling deployment is fully automated
- B . Rolling deployment switches traffic between two versions, while blue-green deployment keeps both versions running simultaneously
- C . Blue-green deployment is suitable for small applications, while rolling deployment is ideal for large applications
- D . Rolling deployment is slower than blue-green deployment in terms of releasing new features
When considering resource utilization for monitoring, what does "capacity planning" involve?
- A . Estimating the number of engineers required for incident management
- B . Forecasting the amount of resources needed to meet performance requirements
- C . Allocating budget for purchasing new monitoring tools
- D . Identifying critical incidents based on historical data
Which of the following is a key tenet of Incident Management?
- A . Minimizing customer communication during incidents
- B . Focusing on quick fixes rather than long-term solutions
- C . Prioritizing incidents based on the impact on the organization
- D . Assigning blame to individuals responsible for incidents
What is the role of logs in troubleshooting for SRE?
- A . Logs are not useful for troubleshooting and can be ignored
- B . Logs help engineers understand system behavior and diagnose issues
- C . Logs are useful only for developers and have no impact on SRE activities
- D . Logs are used for monitoring system performance but not for troubleshooting