Databricks Certified Data Engineer Associate Exam Online Training
The questions for Databricks Certified Data Engineer Associate were last updated on Feb 18, 2025.
- Exam Code: Databricks Certified Data Engineer Associate
- Exam Name: Databricks Certified Data Engineer Associate Exam
- Certification Provider: Databricks
- Latest update: Feb 18, 2025
A data engineer has left the organization. The data team needs to transfer ownership of the data engineer’s Delta tables to a new data engineer. The new data engineer is the lead engineer on the data team.
Assuming the original data engineer no longer has access, which of the following individuals must be the one to transfer ownership of the Delta tables in Data Explorer?
- A . Databricks account representative
- B . This transfer is not possible
- C . Workspace administrator
- D . New lead data engineer
- E . Original data engineer
A data analyst has created a Delta table sales that is used by the entire data analysis team. They want help from the data engineering team to implement a series of tests to ensure the data is clean. However, the data engineering team uses Python for its tests rather than SQL.
Which of the following commands could the data engineering team use to access sales in PySpark?
- A . SELECT * FROM sales
- B . There is no way to share data between PySpark and SQL.
- C . spark.sql("sales")
- D . spark.delta.table("sales")
- E . spark.table("sales")
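As a hedged illustration of reading a metastore table from Python, the sketch below wraps `spark.table`, the SparkSession method that returns a registered table as a DataFrame. The helper name `load_table` is hypothetical, and the code assumes an active SparkSession is passed in (as `spark` is in a Databricks notebook).

```python
def load_table(spark, name="sales"):
    """Return the table registered under `name` as a DataFrame.

    `spark` is assumed to be an active SparkSession; `load_table` is a
    hypothetical helper name used only for this illustration.
    """
    return spark.table(name)
```

In a notebook, `load_table(spark).count()` would return the row count of sales, which a Python test could then assert on.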
Which of the following commands will return the location of database customer360?
- A . DESCRIBE LOCATION customer360;
- B . DROP DATABASE customer360;
- C . DESCRIBE DATABASE customer360;
- D . ALTER DATABASE customer360 SET DBPROPERTIES ('location' = '/user');
- E . USE DATABASE customer360;
A data engineer wants to create a new table containing the names of customers that live in France.
They have written the following command:
A senior data engineer mentions that it is organization policy to include a table property indicating that the new table includes personally identifiable information (PII).
Which of the following lines of code fills in the above blank to successfully complete the task?
- A . There is no way to indicate whether a table contains PII.
- B . "COMMENT PII"
- C . TBLPROPERTIES PII
- D . COMMENT "Contains PII"
- E . PII
Which of the following benefits is provided by the array functions from Spark SQL?
- A . An ability to work with data in a variety of types at once
- B . An ability to work with data within certain partitions and windows
- C . An ability to work with time-related data in specified intervals
- D . An ability to work with complex, nested data ingested from JSON files
- E . An ability to work with an array of tables for procedural automation
Which of the following commands can be used to write data into a Delta table while avoiding the writing of duplicate records?
- A . DROP
- B . IGNORE
- C . MERGE
- D . APPEND
- E . INSERT
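For readers unfamiliar with insert-only merges on Delta tables, here is a minimal sketch that builds such a statement as a string. The function name and the table and key names (`target_table`, `updates`, `id`) are hypothetical, and the statement would still need to be passed to `spark.sql` on a cluster to take effect.

```python
def build_insert_only_merge(target, source, key):
    """Build a MERGE statement that inserts only those source rows whose
    key does not already exist in the target table."""
    return (
        f"MERGE INTO {target} "
        f"USING {source} "
        f"ON {target}.{key} = {source}.{key} "
        "WHEN NOT MATCHED THEN INSERT *"
    )

# Hypothetical table and column names, used only for illustration.
statement = build_insert_only_merge("target_table", "updates", "id")
print(statement)
```

Because rows whose key matches an existing row fall through without any action, running the same statement twice against the same source should not add the rows a second time.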
A data engineer needs to apply custom logic to string column city in table stores for a specific use case. In order to apply this custom logic at scale, the data engineer wants to create a SQL user-defined function (UDF).
Which of the following code blocks creates this SQL UDF?
A)
B)
C)
D)
E)
- A . Option A
- B . Option B
- C . Option C
- D . Option D
- E . Option E
A data analyst has a series of queries in a SQL program. The data analyst wants this program to run every day. They only want the final query in the program to run on Sundays. They ask for help from the data engineering team to complete this task.
Which of the following approaches could be used by the data engineering team to complete this task?
- A . They could submit a feature request with Databricks to add this functionality.
- B . They could wrap the queries using PySpark and use Python’s control flow system to determine when to run the final query.
- C . They could only run the entire program on Sundays.
- D . They could automatically restrict access to the source table in the final query so that it is only accessible on Sundays.
- E . They could redesign the data model to separate the data used in the final query into a new table.
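The idea of wrapping SQL statements in Python and branching on the day of the week can be sketched as follows. The helper `queries_to_run` and the placeholder statements are hypothetical; in a notebook each selected statement would presumably be passed to `spark.sql` instead of printed.

```python
from datetime import date

def queries_to_run(queries, today):
    """Return every query on a Sunday, and all but the final query on any
    other day (date.weekday() returns 6 for Sunday)."""
    if today.weekday() == 6:
        return list(queries)
    return list(queries[:-1])

# Hypothetical placeholder statements standing in for the analyst's program.
program = ["SELECT 1", "SELECT 2", "SELECT 3"]

for statement in queries_to_run(program, date.today()):
    print(statement)  # in a notebook this would be spark.sql(statement)
```

Scheduling this wrapper as a daily job keeps a single program while letting ordinary Python control flow decide whether the last statement runs.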
A data engineer runs a statement every day to copy the previous day’s sales into the table transactions. Each day’s sales are in their own file in the location "/transactions/raw".
Today, the data engineer runs the following command to complete this task:
After running the command today, the data engineer notices that the number of records in table transactions has not changed.
Which of the following describes why the statement might not have copied any new records into the table?
- A . The format of the files to be copied was not included with the FORMAT_OPTIONS keyword.
- B . The names of the files to be copied were not included with the FILES keyword.
- C . The previous day’s file has already been copied into the table.
- D . The PARQUET file format does not support COPY INTO.
- E . The COPY INTO statement requires the table to be refreshed to view the copied rows.
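COPY INTO is documented as idempotent: files in the source location that have already been loaded are skipped on later runs. The toy model below, written in plain Python rather than Spark, imitates that bookkeeping. The class `ToyCopyInto` and the file name are hypothetical, and the model does not reflect how Databricks tracks loaded files internally.

```python
class ToyCopyInto:
    """Toy stand-in for a table that remembers which files it has loaded."""

    def __init__(self):
        self.loaded_files = set()
        self.rows = []

    def copy_into(self, files):
        """Load rows from files not seen before; return how many rows were added."""
        added = 0
        for name, rows in files.items():
            if name in self.loaded_files:
                continue  # already ingested on an earlier run, so skip it
            self.rows.extend(rows)
            self.loaded_files.add(name)
            added += len(rows)
        return added

table = ToyCopyInto()
source = {"day1.parquet": [1, 2, 3]}  # hypothetical file name and rows

first_run = table.copy_into(source)   # loads the three rows
second_run = table.copy_into(source)  # same file again, nothing is added
```

Under this model, rerunning the load against an unchanged source directory leaves the row count as it was.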
A data engineer needs to create a table in Databricks using data from their organization’s existing SQLite database.
They run the following command:
Which of the following lines of code fills in the above blank to successfully complete the task?
- A . org.apache.spark.sql.jdbc
- B . autoloader
- C . DELTA
- D . sqlite
- E . org.apache.spark.sql.sqlite
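For comparison with the SQL form, a rough PySpark sketch of reading an external database table over JDBC is shown below. It uses the DataFrameReader `format`, `option`, and `load` calls; the helper name `read_jdbc_table` is hypothetical, and the sketch assumes a SQLite JDBC driver is available on the cluster.

```python
def read_jdbc_table(spark, url, table):
    """Read an external table over JDBC into a DataFrame.

    `spark` is assumed to be an active SparkSession, and a JDBC driver
    matching `url` is assumed to be installed on the cluster.
    """
    return (
        spark.read.format("jdbc")
        .option("url", url)
        .option("dbtable", table)
        .load()
    )
```

A call might look like `read_jdbc_table(spark, "jdbc:sqlite:/path/to/file.db", "some_table")`, where both the path and the table name are placeholders.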