Leverege / Gcp Data Engineer Exam

Study materials for the Google Cloud Professional Data Engineering Exam

Google Cloud - Professional Data Engineer Exam Study Materials

Several engineers at Leverege recently studied for and passed the Google Cloud Professional Data Engineer certification exam. The exam not only covers Google's flagship big data and machine learning products (e.g. BigQuery, Bigtable, Cloud ML Engine), but also tests your ability to analyze and design solutions for data engineering problems. While we had experience with many of the GCP products tested on the exam, we needed more studying to cover its entire scope. We have put together the collection of study materials that we used to prepare. We hope that our study guides help you pass your exam on your first try!

Exam Format

The Google Cloud Professional Data Engineer exam consists of 50 multiple-choice questions. You have two hours to complete the exam at a certified test location. Note that paper and pencils are not allowed in the exam, so we highly suggest going through the official practice exam without writing anything down to simulate the actual test environment. Some questions ask you to pick multiple answers, but the prompt tells you how many correct answers there are. For example, a question might ask which types of machine learning algorithms you can use given a dataset, and out of six choices you pick the three correct options.

We found the official practice exam to be similar in difficulty to the actual exam. The practice test included at the end of Preparing for the Google Cloud Professional Data Engineer Exam on Coursera was also helpful for seeing what types of questions might be asked. We also took the 50-question practice exam in the Linux Academy course, but found it a bit misleading in terms of question style. A sample Linux Academy question might ask what GCP product to use when you have an existing Hadoop cluster, but most of the actual exam questions presented a customer scenario and focused on designing the solution rather than simply picking a product. It was still a good resource for gauging your pace, however, as the other practice tests are only 25 questions each.

Study Plan

Since this is a GCP Data Engineering exam, it's imperative to know the key GCP products. It's best to get hands-on experience by completing the Data Engineering track on Qwiklabs, but if you are pressed for time, you can read our Data Engineering Notes or the other cheatsheets compiled by jorwalk and ml874.

Once you are familiar with the GCP products, it's good to study up on the Hadoop ecosystem (e.g. Hadoop, Hive, Spark) and its GCP equivalents, as well as key ML concepts. There were no in-depth questions on TensorFlow, machine learning, or deep neural networks, but the exam did test feature engineering strategies (e.g. how to combat overfitting) and the ability to identify problems that machine learning could solve.

There were several questions on the two case studies listed on the website (i.e. Flowlogistic, MJTelco), but the questions did not require you to re-read the actual case study during the exam. All of the information needed to answer a case study question was embedded within the question itself. We suggest going through the video on the Coursera course to dissect the case studies, but not memorizing or over-analyzing this portion.

Overall, there was a heavy emphasis on design, troubleshooting, and optimization of various data engineering scenarios. A common type of problem asked how to redesign an existing solution to work at scale or how to implement a fix given issues in the current architecture.

For example:

  1. A current Cloud SQL implementation has a single table with a few data points. In the future, if the throughput is 100x higher, how can you partition/shard the tables to improve performance?
  2. You need to design a global e-commerce application that can deal with multiple customers trying to buy the same item around the same time. How do you deal with out-of-order data?
  3. A BigQuery command is taking too long to read/compute/write. How do you change your query to fix this?
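
As a rough illustration of the sharding idea in the first example, the sketch below routes each record to one of several tables by hashing its key. This is only one generic way to approach such a question, not the exam's expected answer. The names `NUM_SHARDS`, `shard_for`, and `shard_table_name`, as well as the `readings` table name, are hypothetical and chosen just for this example.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 8  # hypothetical shard count, chosen only for illustration


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a record key to a shard index using a stable hash."""
    # hashlib produces the same digest in every process, unlike the built-in
    # hash(), which is randomized per interpreter run for strings.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def shard_table_name(base: str, key: str, num_shards: int = NUM_SHARDS) -> str:
    """Return a hypothetical per-shard table name such as 'readings_03'."""
    return f"{base}_{shard_for(key, num_shards):02d}"


if __name__ == "__main__":
    # Show how 10,000 made-up keys spread across the shards.
    keys = [f"device-{i}" for i in range(10_000)]
    counts = Counter(shard_for(k) for k in keys)
    for shard, n in sorted(counts.items()):
        print(shard, n)
```

Hashing the key spreads writes evenly, at the cost of making range scans across keys harder. A range-based or date-based partition may suit a scenario better when queries usually filter on that range.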

There were also a fair number of IAM-related questions, consistent with the types of questions on the practice exam. It was really helpful to review the IAM roles for each product, the different role types assigned to a human user versus a service account, and encryption strategies:

  1. You need to give an external consultant access to Dataflow/BigQuery/Bigtable. What role would you assign without giving access to the actual data?
  2. A customer wants to encrypt data at rest, but doesn't want to store the keys on GCP. Where should you create the keys, and how do you encrypt the data?
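
For the encryption question, one option worth knowing is customer-supplied encryption keys (CSEK) in Cloud Storage, where the key is created and kept outside GCP and sent with each request. The sketch below assumes that scenario and only prepares the key material and request headers. It makes no network calls, and the helper names `generate_aes256_key` and `csek_headers` are hypothetical. The header names reflect our understanding of the Cloud Storage documentation, so verify them against the current docs before relying on them.

```python
import base64
import hashlib
import secrets


def generate_aes256_key() -> bytes:
    """Create a random 32-byte (256-bit) key, e.g. on an on-premises machine."""
    return secrets.token_bytes(32)


def csek_headers(key: bytes) -> dict:
    """Build request headers for a customer-supplied key (assumed header names)."""
    if len(key) != 32:
        raise ValueError("AES-256 requires a 32-byte key")
    return {
        "x-goog-encryption-algorithm": "AES256",
        # The raw key, base64-encoded.
        "x-goog-encryption-key": base64.b64encode(key).decode("ascii"),
        # Base64 of the SHA-256 digest of the raw key, used as an integrity check.
        "x-goog-encryption-key-sha256": base64.b64encode(
            hashlib.sha256(key).digest()
        ).decode("ascii"),
    }


if __name__ == "__main__":
    headers = csek_headers(generate_aes256_key())
    # Print only the header names; never log the key itself.
    print(sorted(headers))
```

With this approach, losing the key means losing access to the data, so the key needs its own backup and rotation process outside GCP.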

Finally, not all questions were scenario-based. A fair number of questions simply asked for product-specific details that tested core concepts of GCP products:

  1. How to design the Bigtable row key to improve performance.
  2. How to avoid the exploding index problem in Datastore.
  3. What combination of GCP products to use for streaming data and storage.
  4. Given technical requirements, what open-source Hadoop products to use to process/store data.
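
To make the first item more concrete, here is a minimal sketch of one commonly recommended row key pattern for time series data: lead with an identifier rather than a timestamp so writes spread across the key space, and append a reversed timestamp so the newest rows for each identifier sort first. The `row_key` helper, the `#` separator, and the 13-digit millisecond width are assumptions made for this illustration, and it does not use the Bigtable client library.

```python
MAX_TS_MS = 10**13 - 1  # largest 13-digit millisecond timestamp (assumed width)


def row_key(device_id: str, ts_ms: int) -> bytes:
    """Build a key of the form b'<device_id>#<reversed timestamp>'."""
    if not 0 <= ts_ms <= MAX_TS_MS:
        raise ValueError("timestamp out of range for a 13-digit key")
    # Subtracting from the maximum makes newer timestamps sort earlier.
    reversed_ts = MAX_TS_MS - ts_ms
    # Zero-padding keeps lexicographic order consistent with numeric order.
    return f"{device_id}#{reversed_ts:013d}".encode("utf-8")


if __name__ == "__main__":
    keys = [
        row_key("sensor-b", 1_600_000_000_000),
        row_key("sensor-a", 1_600_000_000_000),
        row_key("sensor-a", 1_600_000_060_000),
    ]
    # Rows are stored in sorted key order, so sorting mimics a table scan.
    for key in sorted(keys):
        print(key.decode("utf-8"))
```

A key that starts with the timestamp would instead send all current writes to the same part of the key space, which is the hotspotting pattern this layout tries to avoid.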

Each exam probably draws from a larger pool of questions, so it's hard to be definitive about which topics to study more, but it's fair to expect more questions on BigQuery, Machine Learning, Bigtable, and Dataflow than on Cloud SQL, Pub/Sub, or Stackdriver. Some members of our team mentioned a question or two about regulatory requirements (e.g. HIPAA, GDPR), but nothing too specific.

We hope that the exam study guide we prepared and used also helps you pass the test. If you notice any inaccuracies or want to contribute, feel free to leave an issue!

Resources

Cheatsheets:

Other Exam Overviews/Debriefs:

Courses:
