Technology keeps evolving, and the software industry sees constant updates. Many people learn only commonplace technologies, which can hold back their careers. University and college students in particular should learn in-demand skills and newer technologies if they want a well-paid career after graduation. I have written this article to lay out a learning path for becoming an Azure Data Engineer.
Introduction to today’s article
Today I have collected information and done some research to show you a path for learning the skills that will make you an Azure Data Engineer. According to Glassdoor, the average salary for an Azure Data Engineer in the US is $101,728 per year, which is around 2 crore Pakistani rupees or 82 lakh Indian rupees per year. I always recommend working remotely from your own country for companies in Europe or the US, so that you earn in dollars and support your country’s economy.
Below is the complete path to becoming an Azure Data Engineer.
Skills Required to Become an Azure Data Engineer
|Azure Storage Account||Basic|
|Azure Data Factory||Intermediate|
|Azure Databricks||Intermediate|
|Data Warehouse and Data||Basic + Intermediate (Good to|
Python – Intermediate
• Control statements
• Conditional statements
• OOP concepts
• Exception Handling
• File I/O
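To make the Python topics above concrete, here is a minimal sketch that touches each of them in one small script: a loop, a condition, a class, exception handling, and file I/O. The `Reading` class, the helper names, and the sample file contents are hypothetical and chosen only for illustration.

```python
import os
import tempfile


class Reading:
    """A tiny class to illustrate OOP: one sensor reading (hypothetical example)."""

    def __init__(self, sensor: str, value: float):
        self.sensor = sensor
        self.value = value

    def is_valid(self) -> bool:
        # Conditional logic: in this sketch, negative values count as invalid.
        return self.value >= 0


def parse_line(line: str) -> Reading:
    """Turn 'sensor,value' into a Reading; raises ValueError on bad input."""
    sensor, raw_value = line.strip().split(",")
    return Reading(sensor, float(raw_value))


def load_readings(path: str) -> list:
    """File I/O plus exception handling: skip lines that cannot be parsed."""
    readings = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:  # control statement: a for loop
            try:
                reading = parse_line(line)
            except ValueError:
                continue  # malformed line, skip it
            if reading.is_valid():
                readings.append(reading)
    return readings


def demo() -> list:
    """Write a small sample file, read it back, and return the valid values."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "readings.csv")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write("a,1.5\nb,not-a-number\nc,-2\nd,4\n")
        return [r.value for r in load_readings(path)]


if __name__ == "__main__":
    print(demo())
```

Calling `demo()` keeps only the two lines that parse and are non-negative. Real pipelines would usually log the rejected lines instead of silently skipping them.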
SQL
• DML, DCL, and DDL commands
• All types of constraints
• INNER, LEFT, RIGHT, SELF, CROSS, and FULL OUTER JOIN
• Window functions: ROW_NUMBER, RANK, DENSE_RANK, etc.
• CTEs, views, and stored procedures
• Indexing: clustered and non-clustered indexes
• Hands-on SQL query questions
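For hands-on practice with the SQL topics above, the following sketch runs a CTE and the three ranking window functions against an in-memory SQLite database from Python. The `employee` table and its rows are made up for illustration, and the script assumes the SQLite library bundled with Python is version 3.25 or later, which is when window functions were added. SQL Server and Azure SQL use the same ranking functions, although other syntax differs between engines.

```python
import sqlite3


def rank_salaries():
    """Rank the (hypothetical) data team by salary using window functions."""
    conn = sqlite3.connect(":memory:")
    # DDL with constraints: a primary key and NOT NULL columns.
    conn.execute(
        "CREATE TABLE employee ("
        "name TEXT PRIMARY KEY, dept TEXT NOT NULL, salary INTEGER NOT NULL)"
    )
    # DML: insert a few sample rows, two of them with the same salary.
    conn.executemany(
        "INSERT INTO employee VALUES (?, ?, ?)",
        [
            ("Asha", "data", 300),
            ("Bilal", "data", 300),
            ("Chen", "data", 200),
            ("Dana", "web", 150),
        ],
    )
    query = """
        WITH data_team AS (              -- CTE: a named, reusable subquery
            SELECT name, salary FROM employee WHERE dept = 'data'
        )
        SELECT name,
               ROW_NUMBER() OVER (ORDER BY salary DESC, name) AS row_num,
               RANK()       OVER (ORDER BY salary DESC)       AS rnk,
               DENSE_RANK() OVER (ORDER BY salary DESC)       AS dense_rnk
        FROM data_team
        ORDER BY row_num
    """
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in rank_salaries():
        print(row)
```

The tied salaries show the difference between the three functions: ROW_NUMBER always increments, RANK leaves a gap after a tie, and DENSE_RANK does not.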
Azure Fundamentals
• How many types of services does Azure provide?
• What is the difference between IaaS, PaaS, and SaaS?
• What is a VM?
• What are a resource group, a tenant, and a subscription in Azure?
Azure Storage Account
• What is a blob, and why use it?
• How many types of blobs does Azure provide?
• What is ADLS Gen2, and why use it?
• Basics of Table, Queue, and File storage
Azure Data Factory
• What is Azure Data Factory?
• What is an IR (integration runtime)?
• What are linked services and datasets?
• Multiple types of triggers
• Different types of activities and their uses
Azure Databricks
• What is Spark?
• What is DBFS?
• Architecture of Spark and its advantages
• What are secret scopes and mount points?
• What is Key Vault?
• Basics of PySpark
• Hands-on loading and writing data in different formats
• How many types of clusters are there, and which one should you use?
• PartitionBy, Repartition and Coalesce
• Broadcast variable and Accumulator
• Different types of Joins
• Different types of file formats and differences among them
• When to use cache and persist during configuration
• Create dimension and fact tables using PySpark or Spark SQL
• What are managed tables and external tables?
• What are delta tables?
• What is SCD?
• Set up an end-to-end job using Job Workflow or an ADF pipeline
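The SCD (slowly changing dimension) question in the list above is easier to answer with an example. Below is a minimal pure-Python sketch of the Type 2 idea: when an attribute changes, the old row is closed and a new current row is added, so history is preserved. The column names (`customer_id`, `city`, `valid_from`, `valid_to`, `is_current`) and the function `apply_scd2` are assumptions made for illustration. On Databricks the same logic is usually expressed as a merge into a Delta table rather than in plain Python.

```python
from datetime import date


def apply_scd2(dimension, incoming, today):
    """Return a new dimension after applying SCD Type 2 changes.

    dimension: list of dict rows (hypothetical columns, see the new_row helper)
    incoming:  dict mapping customer_id to the latest city
    today:     date used to close old rows and open new ones
    """

    def new_row(key, city):
        return {
            "customer_id": key,
            "city": city,
            "valid_from": today,
            "valid_to": None,
            "is_current": True,
        }

    result = []
    keys_with_current_row = set()
    for original in dimension:
        row = dict(original)  # copy so the input list is not mutated
        key = row["customer_id"]
        if row["is_current"]:
            keys_with_current_row.add(key)
            if key in incoming and incoming[key] != row["city"]:
                # Attribute changed: close the old version, then add a new one.
                row["valid_to"] = today
                row["is_current"] = False
                result.append(row)
                result.append(new_row(key, incoming[key]))
                continue
        result.append(row)  # unchanged or historical rows are kept as they are

    # Keys with no current row yet are inserted as new current rows.
    for key, city in incoming.items():
        if key not in keys_with_current_row:
            result.append(new_row(key, city))
    return result


if __name__ == "__main__":
    dim = [
        {"customer_id": 1, "city": "Lahore", "valid_from": date(2023, 1, 1),
         "valid_to": None, "is_current": True},
    ]
    for r in apply_scd2(dim, {1: "Karachi", 2: "Delhi"}, date(2024, 1, 1)):
        print(r)
```

In the usage example, customer 1 ends up with two rows (the closed Lahore row and the current Karachi row), and customer 2 is inserted as a new current row.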
Data Modeling and Data Warehouse Concepts
• What is data modeling?
• Explain various types of data models
• Explain facts and fact tables
• Differentiate between a fact table and a dimension table
• Differentiate between a data warehouse and a database
• Difference between a database, a data warehouse, and a data mart
• List the various design schemas in data modeling
• What is in-memory analytics?
• What is the difference between OLTP and OLAP?
• What are the different types of cardinality in relationships?
• What is the difference between Star schema and Snowflake schema?
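As a small illustration of the fact table, dimension table, and star schema questions above, this sketch builds a toy star schema in an in-memory SQLite database and runs a typical analytical query over it. All table names, column names, and values (`fact_sales`, `dim_product`, `dim_date`, and so on) are invented for the example. In a snowflake schema, `category` would be moved out of `dim_product` into its own table.

```python
import sqlite3


def sales_by_category():
    """Total sales amount per product category and year from a toy star schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Dimension tables hold descriptive attributes.
        CREATE TABLE dim_product (
            product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY, year INTEGER);

        -- The fact table holds measures plus a key into each dimension.
        CREATE TABLE fact_sales (
            product_key INTEGER REFERENCES dim_product(product_key),
            date_key    INTEGER REFERENCES dim_date(date_key),
            quantity    INTEGER,
            amount      REAL);

        INSERT INTO dim_product VALUES
            (1, 'Laptop', 'Hardware'),
            (2, 'Mouse', 'Hardware'),
            (3, 'Antivirus', 'Software');
        INSERT INTO dim_date VALUES (20230101, 2023), (20240101, 2024);
        INSERT INTO fact_sales VALUES
            (1, 20230101, 1, 1000.0),
            (2, 20230101, 2, 40.0),
            (3, 20240101, 1, 60.0),
            (1, 20240101, 1, 1100.0);
        """
    )
    # A typical OLAP-style query: join the fact to its dimensions, then aggregate.
    rows = conn.execute(
        """
        SELECT p.category, d.year, SUM(f.amount) AS total_amount
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        JOIN dim_date    AS d ON d.date_key = f.date_key
        GROUP BY p.category, d.year
        ORDER BY p.category, d.year
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in sales_by_category():
        print(row)
```

The fact table stays narrow and numeric, while the descriptive columns used for grouping and filtering live in the dimensions.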
Power BI
• What is Power BI?
• What is a report?
• What is a dashboard?
• What are measures?
• What are calculated columns?
• How to create a relationship between tables in Power BI?
• Which chart/visual should be used for a given dataset?
• Design a Power BI report using a dummy dataset (e.g., a Covid-19 dataset)
Logic App
• Definition of a Logic App
• How to use a Logic App in ADF?
• Hands-on on small automation task to send emails to users when the ADF job
Azure DevOps
• What is Azure DevOps?
• What is CI/CD?
• What is a service connection?
• What are artifacts and builds?
• Hands-on with an ADB CI/CD pipeline