Free Databricks Certified Data Engineer Professional Exam Databricks-Certified-Professional-Data-Engineer Exam Practice Test
Total Questions: 120
A data engineer is configuring a pipeline that will potentially see late-arriving, duplicate records. In addition to de-duplicating records within the batch, which of the following approaches allows the data engineer to deduplicate data against previously processed records as it is inserted into a Delta table?
Answer: C
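One pattern commonly used for this scenario in Delta Lake is an insert-only merge (MERGE with only a WHEN NOT MATCHED THEN INSERT clause). The sketch below models that logic in plain Python instead of calling Spark, so it runs anywhere. The key column `event_id`, the table names in the comment, and both helper functions are hypothetical names chosen for illustration.

```python
# Roughly equivalent Delta SQL (table and column names are hypothetical):
#   MERGE INTO events t USING new_events s ON t.event_id = s.event_id
#   WHEN NOT MATCHED THEN INSERT *


def dedupe_batch(batch, key):
    """Drop duplicates within a batch, keeping the first record seen per key."""
    seen = {}
    for record in batch:
        seen.setdefault(record[key], record)
    return list(seen.values())


def insert_only_merge(target, batch, key):
    """Model an insert-only merge: append only rows whose key is not in target."""
    existing = {row[key] for row in target}
    new_rows = [r for r in dedupe_batch(batch, key) if r[key] not in existing]
    return target + new_rows


target = [{"event_id": 1, "value": "a"}]
batch = [
    {"event_id": 1, "value": "a"},  # late-arriving duplicate of a stored row
    {"event_id": 2, "value": "b"},
    {"event_id": 2, "value": "b"},  # duplicate within the same batch
]
merged = insert_only_merge(target, batch, "event_id")
```

Here the stored row with `event_id` 1 is left untouched and only one copy of `event_id` 2 is appended.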
A Delta Lake table representing metadata about content posts from users has the following schema:
user_id LONG, post_text STRING, post_id STRING, longitude FLOAT, latitude FLOAT, post_time TIMESTAMP, date DATE
This table is partitioned by the date column. A query is run with the following filter:
longitude < 20 & longitude > -20
Which statement describes how data will be filtered?
Answer: D
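Because longitude is not the partition column, partition pruning on date does not help with this filter. Delta Lake can instead use the per-file minimum and maximum column statistics recorded in the transaction log to skip files that cannot contain matching rows. The sketch below models that file-skipping decision in plain Python; the file names and statistics are made up for illustration, and `FileStats` and `files_to_scan` are hypothetical helpers, not Delta Lake APIs.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    path: str
    min_longitude: float
    max_longitude: float


def files_to_scan(files, lower, upper):
    """Keep files whose [min, max] range could hold a value v with lower < v < upper."""
    return [
        f.path
        for f in files
        if f.max_longitude > lower and f.min_longitude < upper
    ]


# Hypothetical statistics for three data files.
files = [
    FileStats("part-0.parquet", -170.0, -95.0),
    FileStats("part-1.parquet", -30.0, 5.0),
    FileStats("part-2.parquet", 40.0, 120.0),
]

# Filter from the question: longitude < 20 and longitude > -20.
selected = files_to_scan(files, -20.0, 20.0)
```

Only files whose statistics overlap the filter range need to be read; the others are skipped without opening them.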
What statement is true regarding the retention of job run history?
Answer: C
Which configuration parameter directly affects the size of a Spark partition upon ingestion of data into Spark?
Answer: A
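The setting usually associated with input partition size for file-based sources is spark.sql.files.maxPartitionBytes, which defaults to 128 MB. As a rough illustration of its effect, the sketch below estimates how many input partitions a single splittable file would produce. This is simplified arithmetic under stated assumptions: it ignores spark.sql.files.openCostInBytes and the parallelism-based adjustment Spark also applies, and `estimate_input_partitions` is a hypothetical helper.

```python
import math

# Default value of spark.sql.files.maxPartitionBytes (128 MB).
DEFAULT_MAX_PARTITION_BYTES = 128 * 1024 * 1024


def estimate_input_partitions(file_size_bytes,
                              max_partition_bytes=DEFAULT_MAX_PARTITION_BYTES):
    """Rough estimate: one partition per max_partition_bytes chunk of the file."""
    return math.ceil(file_size_bytes / max_partition_bytes)


one_gib = 1024 ** 3
default_estimate = estimate_input_partitions(one_gib)
smaller_estimate = estimate_input_partitions(one_gib, 64 * 1024 * 1024)
```

Halving the maximum partition size doubles the estimated partition count for the same input, which is why this setting is the usual lever for controlling partition size at read time.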
A production cluster has 3 executor nodes and uses the same virtual machine type for the driver and executor. When evaluating the Ganglia Metrics for this cluster, which indicator would signal a bottleneck caused by code executing on the driver?
Answer: E
A CHECK constraint has been successfully added to the Delta table named activity_details using the following logic:
A batch job is attempting to insert new records to the table, including a record where latitude = 45.50 and longitude = 212.67. Which statement describes the outcome of this batch insert?
Answer: B
A Spark job is taking longer than expected. Using the Spark UI, a data engineer notes that the Min, Median, and Max Durations for tasks in a particular stage show the minimum and median time to complete a task as roughly the same, but the max duration for a task as roughly 100 times as long as the minimum. Which situation is causing increased duration of the overall job?
Answer: D
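A stage in which most tasks finish in about the same time while one task takes far longer is the usual signature of skew, where one partition holds much more data than the rest. The sketch below computes the max-to-median ratio from a list of task durations, the same comparison the question describes. The durations are invented, and the threshold of 10 is an arbitrary assumption rather than a Spark setting.

```python
from statistics import median


def skew_ratio(durations):
    """Ratio of the slowest task to the median task; large values suggest skew."""
    return max(durations) / median(durations)


# Hypothetical task durations in seconds for one stage.
durations = [2.0, 2.1, 1.9, 2.0, 200.0]

ratio = skew_ratio(durations)
is_skewed = ratio > 10  # arbitrary threshold chosen for illustration
```

Because the stage cannot finish until its slowest task does, a single oversized partition like this one determines the duration of the whole stage.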
Which is a key benefit of an end-to-end test?
Answer: A
Review the following error traceback:
Which statement describes the error being raised?
Answer: E
Which statement describes Delta Lake optimized writes?
Answer: A
