Study for the Google Cloud Professional Data Engineer Exam with engaging Q&A. Each question features hints and detailed explanations to enhance your understanding. Prepare confidently and ensure your success!

What could cause files to not be discovered in Dataplex?

  1. You have an exclude pattern that matches the files.

  2. You have scheduled discovery to run every hour.

  3. The files are in ORC format.

  4. The files are in Parquet format.

The correct answer is: You have an exclude pattern that matches the files.

Exclude patterns exist to filter files out of discovery. When a file matches an exclude pattern defined in the Dataplex configuration, the discovery process skips it, so the file never becomes available for querying or analytics. This lets you keep files that are unnecessary or irrelevant to the intended analysis out of the discovered data, which keeps processing efficient and relevant.

The other options do not prevent discovery. The schedule only controls how often discovery runs: with an hourly schedule, files are still discovered on each run unless they match an exclusion. ORC and Parquet are both formats that Dataplex supports, so the file format alone would not make the files undiscoverable.
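To make the filtering behavior concrete, here is a minimal Python sketch of how an exclude pattern removes files from a candidate list. It is an illustration only, not Dataplex's implementation: the helper name `discoverable`, the sample paths, and the pattern are all hypothetical, and Python's `fnmatchcase` glob matching stands in for however Dataplex evaluates its patterns.

```python
from fnmatch import fnmatchcase


def discoverable(paths, exclude_patterns):
    """Return the paths that match none of the exclude patterns.

    Hypothetical helper: a stand-in for discovery filtering, using
    Python glob matching rather than Dataplex's own pattern rules.
    """
    return [
        path
        for path in paths
        if not any(fnmatchcase(path, pattern) for pattern in exclude_patterns)
    ]


# Hypothetical object names in a bucket.
paths = [
    "sales/2024/orders.parquet",
    "sales/2024/orders.orc",
    "sales/tmp/scratch.parquet",
]

# Hypothetical exclude pattern: skip anything under a tmp/ folder.
exclude_patterns = ["*/tmp/*"]

print(discoverable(paths, exclude_patterns))
# ['sales/2024/orders.parquet', 'sales/2024/orders.orc']
```

In this sketch the Parquet and ORC files outside `tmp/` both pass through, while the file matching the exclude pattern is dropped regardless of its format. That mirrors the reasoning behind the correct answer.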