Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You have a table named Table1 that contains 3 billion rows. Table1 contains data from the last 36 months.
At the end of every month, the oldest month of data is removed based on a column named DateTime.
You need to minimize how long it takes to remove the oldest month of data.
Solution: You specify DateTime as the hash distribution column.
Does this meet the goal?
Answer : B
Explanation:
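A hash-distributed table spreads its rows across the Compute nodes by applying a deterministic hash function to the distribution column, which assigns each row to exactly one distribution.
Because identical values always hash to the same distribution, the data warehouse knows where rows are located. SQL Data Warehouse uses this knowledge to minimize data movement during queries, which improves query performance. Hash distribution does not group rows by time period, however, so removing the oldest month still means deleting its rows individually, and the removal time is not minimized.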
Note: A distributed table appears as a single table, but the rows are actually stored across 60 distributions. The rows are distributed with a hash or round-robin algorithm.
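The behavior described above can be illustrated with a small simulation. This is only a sketch: MD5 stands in for the service's internal hash function, which is an assumption, and the sample rows and the helper name `distribution_for` are hypothetical.

```python
import hashlib
from collections import defaultdict
from datetime import datetime, timedelta

NUM_DISTRIBUTIONS = 60  # number of distributions stated in the note above


def distribution_for(value: str) -> int:
    """Map a value to a distribution deterministically.

    MD5 is a stand-in for the real hash function (an assumption).
    """
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_DISTRIBUTIONS


# Hypothetical sample data: one row per hour for a single month.
start = datetime(2020, 1, 1)
rows = [start + timedelta(hours=h) for h in range(31 * 24)]

placement = defaultdict(list)
for ts in rows:
    placement[distribution_for(ts.isoformat())].append(ts)

# Identical values always land in the same distribution.
same_twice = distribution_for("2020-01-15T00:00:00") == distribution_for("2020-01-15T00:00:00")

# One month of distinct DateTime values is scattered over many distributions.
touched = len(placement)
print(f"distributions holding rows from this month: {touched} of {NUM_DISTRIBUTIONS}")
```

In this sketch the month's rows are scattered across many distributions, so no single unit of storage corresponds to "the oldest month".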
References: https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-tables-distribute
You have a table named Table1 that contains 3 billion rows. Table1 contains data from the last 36 months.
At the end of every month, the oldest month of data is removed based on a column named DateTime.
You need to minimize how long it takes to remove the oldest month of data.
Solution: You implement range partitioning based on the year and the month.
Does this meet the goal?
Answer : A
Explanation:
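Data from the same time period is stored in the same partition. This makes it faster to remove one month of data, because the oldest partition can be removed as a unit (for example, by switching it out of the table) instead of deleting its rows one by one.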
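The idea can be sketched with a toy in-memory table. The class `MonthPartitionedTable` and its methods are hypothetical and only model the grouping; they are not the data warehouse's API.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, List, Tuple

MonthKey = Tuple[int, int]  # (year, month)


class MonthPartitionedTable:
    """Toy table whose rows are grouped into one partition per year and month."""

    def __init__(self) -> None:
        self.partitions: Dict[MonthKey, List[datetime]] = defaultdict(list)

    def insert(self, ts: datetime) -> None:
        self.partitions[(ts.year, ts.month)].append(ts)

    def drop_oldest_month(self) -> MonthKey:
        # Tuples compare element by element, so min() finds the earliest (year, month).
        oldest = min(self.partitions)
        # The whole partition is removed in one step, regardless of its row count.
        del self.partitions[oldest]
        return oldest

    def row_count(self) -> int:
        return sum(len(rows) for rows in self.partitions.values())


# Hypothetical sample data: 28 rows in each of three consecutive months.
table = MonthPartitionedTable()
for year, month in ((2019, 11), (2019, 12), (2020, 1)):
    for day in range(1, 29):
        table.insert(datetime(year, month, day))

dropped = table.drop_oldest_month()
print(dropped, table.row_count())
```

Removing the oldest month here costs one dictionary deletion, which mirrors why partition-level removal scales better than a row-by-row delete.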
You have a table named Table1 that contains 3 billion rows. Table1 contains data from the last 36 months.
At the end of every month, the oldest month of data is removed based on a column named DateTime.
You need to minimize how long it takes to remove the oldest month of data.
Solution: You implement round robin for table distribution.
Does this meet the goal?
Answer : B
Explanation:
A distributed table appears as a single table, but the rows are actually stored across 60 distributions. The rows are distributed with a hash or round-robin algorithm.
A round-robin distributed table distributes table rows evenly across all distributions. The assignment of rows to distributions is random. Unlike hash-distributed tables, rows with equal values are not guaranteed to be assigned to the same distribution. Round-robin distribution therefore does nothing to group one month of data together, so it does not shorten the removal.
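A minimal sketch of this behavior follows. It assigns rows with a simple cycling counter, which is a simplification of the random assignment described above, and the function name `round_robin_assign` is hypothetical.

```python
from itertools import cycle

NUM_DISTRIBUTIONS = 60  # number of distributions stated above


def round_robin_assign(rows):
    """Assign rows to distributions in turn, ignoring their values.

    A cycling counter is used here for simplicity (an assumption);
    it still shows that placement does not depend on row content.
    """
    placement = {d: [] for d in range(NUM_DISTRIBUTIONS)}
    for row, d in zip(rows, cycle(range(NUM_DISTRIBUTIONS))):
        placement[d].append(row)
    return placement


# Hypothetical sample data: 120 rows that all carry the same DateTime value.
placement = round_robin_assign(["2020-01-15"] * 120)

sizes = {len(rows) for rows in placement.values()}
used = sum(1 for rows in placement.values() if rows)
print(f"distributions used: {used}, rows per distribution: {sizes}")
```

Even rows with identical values end up in every distribution, so a month of data is never stored together.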
References: https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-tables-distribute
Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You are troubleshooting a slice in Microsoft Azure Data Factory for a dataset that has been in a waiting state for the last three days. The dataset should have been ready two days ago.
The dataset is being produced outside the scope of Azure Data Factory. The dataset is defined by using the following JSON code.
Answer : B
Explanation:
Unless a dataset is being produced by Azure Data Factory, it should be marked as external.
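As an illustration only, a Data Factory version 1 dataset marked as external might look like the structure below, built here as a Python dictionary and serialized to JSON. This is not the dataset from the question; the dataset name, linked service name, and folder path are hypothetical, and `is_marked_external` is a helper invented for this sketch.

```python
import json

# Hypothetical dataset definition in the Data Factory version 1 JSON shape.
dataset = {
    "name": "ExternalInputDataset",  # hypothetical name
    "properties": {
        "type": "AzureBlob",
        "linkedServiceName": "StorageLinkedService",  # hypothetical name
        "typeProperties": {"folderPath": "input/"},  # hypothetical path
        # Marks the data as produced outside Data Factory.
        "external": True,
        "availability": {"frequency": "Day", "interval": 1},
    },
}


def is_marked_external(ds: dict) -> bool:
    """Return True only when the dataset explicitly sets external to true."""
    return ds.get("properties", {}).get("external", False) is True


rendered = json.dumps(dataset, indent=2)
print(rendered)
print("marked as external:", is_marked_external(dataset))
```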
References: https://github.com/MicrosoftDocs/azure-docs/blob/master/articles/data-factory/v1/data-factory-json-scripting-reference.md