A data engineer designs and maintains an organization's entire data infrastructure, handling data extraction, transformation, loading, and processing across structured, semi-structured, and unstructured data types. Google Cloud Platform (GCP) provides comprehensive data engineering services including Google Cloud Storage for data lakes, BigQuery for fast querying and analytics, DataProc for big data processing, Dataflow for streaming and batch processing, and Cloud Composer for pipeline orchestration. GCP offers advantages such as petabyte bandwidth, flexible pricing, live migration capabilities, and inbuilt machine learning APIs, making it a preferred choice for modern data engineering workloads.
Deep Dive
GCP DATA ENGINEERING BY VIJAY SIR DURGASOFT YOUTUBE LIVE STREAM
Good evening, all.
I'm Vijay. I have 20-plus years of experience across technologies including big data, Hadoop, Spark, PySpark, and GCP data engineering.
Before I start, can anyone please confirm that my voice is clear and my screen is visible?
>> Yes.
Right, fine.
As part of today's demo, I'll be covering the following things:
- Who is a data engineer, and how is a data engineer different from a traditional ETL developer?
- What are the various data engineer roles and responsibilities?
- What is Google Cloud Platform?
- The various GCP data engineering products and services.
- The GCP job market.
If you go back 10 to 15 years, there was no concept of a data engineer. We used to have ETL developers.
But how is a data engineer different from that? The tool we are using here is the cloud, GCP.
So let's talk about the advantage of cloud over on-premises.
In the case of on-premises, we need to set up our own hardware infrastructure. On top of that comes OS installation, purchasing of database licenses, and designing of database objects.
To maintain all this, we require separate teams: a hardware team, an operating system team, a networking team, a security team, and database administrators.
But in the cloud, we don't need to maintain any of this. No need to maintain any infrastructure. No need to maintain any licenses. Everything is in the cloud.
That is the advantage of cloud over on-premises.
We have many cloud service providers: AWS, Azure, GCP, Oracle Cloud, Salesforce, VMware, Liquid Web, Alibaba, Rackspace, DigitalOcean.
But most companies prefer GCP, Azure, or AWS.
Why GCP, as compared with Azure or AWS?
See here. GCP's growth is increasing rapidly.
Here we store data, we retrieve or read data, and we perform data analytics.
We get high availability of data. Data is stored in different places, so if data is lost in one place, we can get it from another. If one server is down, we get the data from another server. Your data will be safe and secure.
Then, it is easy to scale. We can increase or decrease capacity as per our requirements.
GCP offers a wide scope of applications: running web applications, data analytics, and machine learning.
Why choose GCP as compared with Azure or AWS?
One, it is the fastest, biggest network in the world. Google has data centers throughout the world, and they use high-end virtual machines.
Petabyte bandwidth. High performance.
Separation of compute and storage. Every individual virtual machine has its own separate RAM and data storage.
Each cluster has a number of nodes, and we can store data across those nodes. If data is stored on a single node and that node crashes, the data is lost. So in GCP, if one copy is stored on one machine, other copies are stored across multiple machines.
Here we can retrieve data from different data centers.
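The idea of keeping copies on several machines can be illustrated with a small toy model. This is only a sketch of the general replication idea, not how GCP implements it; the class name `ReplicatedStore`, the node names, and the placement rule are all assumptions made for illustration.

```python
from typing import Dict, List, Set


class ReplicatedStore:
    """Toy key-value store that keeps several copies of every value (illustrative only)."""

    def __init__(self, node_names: List[str], replicas: int = 3) -> None:
        if not 1 <= replicas <= len(node_names):
            raise ValueError("replicas must be between 1 and the number of nodes")
        self.node_names = list(node_names)
        self.nodes: Dict[str, Dict[str, str]] = {name: {} for name in node_names}
        self.replicas = replicas
        self.down: Set[str] = set()

    def put(self, key: str, value: str) -> List[str]:
        """Write the value to `replicas` distinct nodes and return their names."""
        # Hypothetical placement rule: pick consecutive nodes starting at an
        # offset derived from the key, so copies land on different machines.
        start = sum(key.encode("utf-8")) % len(self.node_names)
        targets = [
            self.node_names[(start + i) % len(self.node_names)]
            for i in range(self.replicas)
        ]
        for name in targets:
            self.nodes[name][key] = value
        return targets

    def fail(self, name: str) -> None:
        """Mark a node as crashed."""
        self.down.add(name)

    def get(self, key: str) -> str:
        """Read from the first healthy node that holds a copy."""
        for name in self.node_names:
            if name not in self.down and key in self.nodes[name]:
                return self.nodes[name][key]
        raise KeyError(key)


store = ReplicatedStore(["node-a", "node-b", "node-c", "node-d"], replicas=3)
holders = store.put("orders.csv", "order data")
store.fail(holders[0])          # one machine crashes
print(store.get("orders.csv"))  # still readable from another copy
```

With a single copy, the crash of that one node would make the read fail; with three copies, the read only fails when every node holding a copy is down.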
GCP has flexible pricing as compared with AWS and Azure. Discounted prices are also given for running bigger workloads when you commit for a fixed term, for example 1 year or 3 years.
Customization. There is a wide range of customizations. For a particular instance, we can customize depending on the requirement, for instance having 1 petabyte of storage.
Scalability. It is easy to scale capacity up or down.
Live migration. GCP supports live migration of virtual machines, so a running VM can be moved without downtime. And whatever existing system you have, you can migrate its data to GCP without impacting the existing data.
Big data and ML. We get many inbuilt, predefined ML and AI APIs in GCP.
Free trials. Some services are free for 3 months, some are free for 1 year, and some are free for lifetime.
Customer systems migration: migrating existing systems and existing data to GCP. We have HSBC, Twitter, The New York Times, Spotify, Vodafone. They are all migrating to GCP.
So this is why we choose GCP. These are the major things: it is the fastest, biggest network in the world, and it has petabyte bandwidth, flexible pricing, scalability, live migration, free trials, and customer systems migration.
Okay. In data engineering, everything deals with data.
The whole world is flooded with user data, and using data we perform everything.
Data carries experience. Data is kept for future reference. Using data, or data science, we perform data analysis.
In data engineering we handle data: ingesting or extracting data, storing, filtering, transforming, processing, merging, sorting, grouping, and analyzing data, and mainly loading it.
So data engineering involves ETL. How do we extract the data? What are the sources of the data? How do we process the data? And what is the destination of the data?
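Those four questions (how to extract, from which source, how to process, and to which destination) can be sketched as a tiny extract-transform-load script. This is a minimal illustration using only the Python standard library; the sample data, column names, and the in-memory destination are assumptions, and a real pipeline would read from and write to actual storage.

```python
import csv
import io
import json
from collections import defaultdict
from typing import Dict, List, TextIO

# Hypothetical source: a small CSV held in memory so the example is self-contained.
SOURCE_CSV = """order_id,city,amount
1,Hyderabad,250.0
2,Chennai,-10.0
3,Hyderabad,100.5
4,Mumbai,75.0
"""


def extract(text: str) -> List[Dict[str, str]]:
    """Extract: read rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows: List[Dict[str, str]]) -> List[Dict[str, object]]:
    """Transform: filter invalid rows, group by city, sort by total."""
    totals: Dict[str, float] = defaultdict(float)
    for row in rows:
        amount = float(row["amount"])
        if amount <= 0:                 # filtering
            continue
        totals[row["city"]] += amount   # grouping
    records = [{"city": city, "total": total} for city, total in totals.items()]
    return sorted(records, key=lambda r: r["total"], reverse=True)  # sorting


def load(records: List[Dict[str, object]], destination: TextIO) -> None:
    """Load: write one JSON document per line to the destination."""
    for record in records:
        destination.write(json.dumps(record) + "\n")


destination = io.StringIO()  # stands in for a file, table, or bucket object
load(transform(extract(SOURCE_CSV)), destination)
print(destination.getvalue(), end="")
```

Keeping extract, transform, and load as separate functions means the source or the destination can be swapped without touching the processing step.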
A data engineer designs and maintains an organization's entire data infrastructure.
With the help of cloud services, data engineers work on the data and make it structured and useful.
A data engineer is responsible for smooth data flow from source to destination.
How is a data engineer different from a data scientist?
A data scientist strategizes for the future: analyzing the data, identifying the key patterns, and making recommendations for the future.
A data scientist does many things: data analysis, data predictions, data classification, data clustering, and data recommendation.
On the other side, a data analyst summarizes and visualizes the data of the past.
A data engineer is mainly for data management, data storage, and data processing.
So a data engineer should be able to work with structured data, semi-structured data, and unstructured data.
A data engineer should be able to work with streaming data, meaning data which keeps on being generated.
A data engineer should be able to work with NoSQL data ("not only SQL"), with its schema-less behavior and random access.
And you should be able to work with all the file formats, like CSV, JSON, Parquet, and XML.
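As a small illustration of working across file formats, the sketch below parses the same two records from CSV, JSON, and XML using only the Python standard library. The sample records and field names are made up for the example. Parquet is left out because reading it needs a third-party library such as pyarrow.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET
from typing import Dict, List

# Hypothetical sample data: the same two records in three formats.
CSV_TEXT = "id,name\n1,Asha\n2,Ravi\n"
JSON_TEXT = '[{"id": 1, "name": "Asha"}, {"id": 2, "name": "Ravi"}]'
XML_TEXT = (
    "<users>"
    "<user id='1'><name>Asha</name></user>"
    "<user id='2'><name>Ravi</name></user>"
    "</users>"
)

Record = Dict[str, object]


def from_csv(text: str) -> List[Record]:
    """CSV is structured: every row has the same columns, all read as strings."""
    return [
        {"id": int(row["id"]), "name": row["name"]}
        for row in csv.DictReader(io.StringIO(text))
    ]


def from_json(text: str) -> List[Record]:
    """JSON is semi-structured: types and nesting travel with the data."""
    return json.loads(text)


def from_xml(text: str) -> List[Record]:
    """XML is semi-structured: values live in attributes and child elements."""
    root = ET.fromstring(text)
    return [
        {"id": int(user.get("id")), "name": user.findtext("name")}
        for user in root.findall("user")
    ]


for parse, text in ((from_csv, CSV_TEXT), (from_json, JSON_TEXT), (from_xml, XML_TEXT)):
    print(parse.__name__, parse(text))
```

All three parsers return the same list of dictionaries, which is the point: once the data is in a common shape, the rest of the pipeline does not need to care which format it arrived in.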
So here we have the data scientist, the data engineer, and the data analyst.
The data engineer does all the groundwork for the data scientist and the analyst.
More than 60% of data analysis work is done by whom? By data engineers.
Now, coming to GCP, Google Cloud Platform.
It is a cloud computing service. It offers customers Platform as a Service (PaaS), Infrastructure as a Service (IaaS), and serverless compute.
A GCP data engineer designs, builds, and maintains data pipelines using Google Cloud. A GCP data engineer ensures data quality, optimizes data storage and data processing, collaborates with data scientists and analysts, and monitors the data infrastructure.
A GCP data engineer maintains the entire organization's data infrastructure, complex data pipelines, and big data technologies, whereas an ETL developer focuses more on the extract, transform, and load process.
A data engineer has a broader, more architectural role utilizing big data platforms and cloud services, whereas the ETL developer has a more specialized role using traditional ETL tools.
And what are the various GCP data engineering products and services?
We have Google Cloud Storage for storing data.
BigQuery for fast querying, storage, and analysis of large data sets.
DataProc for big data processing and analytics.
Databricks for PySpark processing in GCP.
Dataflow for stream and batch data processing.
Cloud Composer (Apache Airflow) for automating and monitoring data pipelines.
Pub/Sub for messaging and data ingestion.
You will get the total idea once you go through the course content. What are the things we'll be discussing?
Module one: GCP data engineering fundamentals.
Google Cloud Storage.
Cloud SQL, setting up a database.
BigQuery for building a data warehouse.
DataProc, big data processing.
Databricks, PySpark processing.
Dataflow, Apache Beam development.
Google Cloud Composer, orchestration.
Data Fusion, the data integration service.
dbt, the data build tool.
Airflow and Terraform.
So, majorly these 10 things will be discussed, in brief.
In more detail: introduction to Google Cloud Platform, overview of cloud platforms, and GCP data engineering fundamentals.
Getting started with GCP: creating a GCP account, getting a new Google account using a non-Gmail ID, and an overview of credits, pricing, and billing.
Storage and databases in GCP: Cloud Storage and BigQuery; Cloud SQL and Cloud Spanner for relational databases; Firestore and Bigtable for NoSQL databases; and data modeling best practices.
Google Cloud Storage (GCS): getting started with a data lake using GCS.
Cloud SQL: setting up a PostgreSQL database using Cloud SQL, and setting up tables in GCP.
Data ingestion and integration: Pub/Sub, Dataflow, DataProc.
Data processing and transformations.
Data warehousing and analytics: BigQuery for building a data warehouse.
DataProc: big data processing.
ETL data pipelines using DataProc.
Databricks: PySpark processing in GCP with many working examples, and ETL pipelines using Databricks.
Integration of Spark, DataProc, and BigQuery.
Dataflow: Apache Beam development.
Cloud Pub/Sub.
Google Cloud Composer for data pipeline orchestration.
Data Fusion, Cloud Functions, Terraform.
Data pipelines using dbt (data build tool), Airflow, and BigQuery.
Machine learning in GCP.
Security, monitoring, and governance.
Real-world GCP data engineering scenarios: building a streaming pipeline, building a batch pipeline, data migration from on-prem to GCP, and designing a hybrid data lake with BigQuery, Dataplex, and GCS.
Projects: around two I'll be discussing as part of this course.

Next, the offerings from my side. You'll be getting every live session video. Every session is recorded, so if you miss a session, you can go through the video and attend the next session with better understanding. You'll also get a soft copy of the class notes, assignments or tasks to work on, a WhatsApp group for technical discussions, around 200 interview questions, and live, interactive, hands-on sessions. All the sessions will be interactive: at any point you can ask your queries, and if you want to do things practically along with me, you can.

Apart from my offerings, there is the DurgaSoft offering: access to around 100 Python videos, around 35 MySQL videos, and around 100 Linux videos. You can go through those. Basic Python and basic SQL are the prerequisites for this course, so I'll also be discussing the Python syntax.

Any other queries? This was just an introduction to what we'll be discussing and what you'll be learning as part of this course.

The data engineering field has great demand. Data engineers and data scientists are in great demand today. If you look at the salary packages, you can ask for around 1.5 times more than in other technologies. In the same way, they'll be expecting command over these technical skills from you. By the end of this course, you will gain about as much knowledge as a typical person with 3-plus or 4-plus years of experience.

It's a 10 to 12 weekend course, around 3 months: Saturday for 2 hours and Sunday for 2 hours, so 4 hours a week for around 12 weeks. Sometimes I'll extend a session by one more hour.

Any other queries, anyone?

>> Sir, you are covering BigQuery also, right?
Yes, BigQuery in brief. One second, you can check this.
>> Meaning how to write BigQuery queries to fetch the data, that kind of thing?
Yes. You can see it here: integration of Spark, DataProc, and BigQuery, and also BigQuery for building a data warehouse.
>> Okay. So whatever basics are required in BigQuery, you are going to provide?
Yes. At the least, we can understand the queries and write BigQuery queries to fetch the data. Integrations also I'll be discussing. Everything in brief; that's why I'll be spending around 3 months on this. Clear notes will be provided, and you'll be getting the videos, assignments and tasks, and interview questions.
>> Sir, can we change the time, or is 5:00 p.m. fixed?
Initially it will be 5:00 p.m. for 2 to 3 weeks. Later the timing will be 6:00 to 8:00.
>> Okay.
Tomorrow also you can attend the session at the same time, 5:00, using the same link.
>> Sir, are you covering anything related to CI/CD or pipeline settings?
Yes.
Tomorrow we'll have some more introduction, and from next week we'll start the actual things. Tomorrow I'll be discussing how we can interact with the services: the different methods, such as command-based, through the console, and through the SDK, and the differences you'll observe.
If there are no other queries, I'm signing off for now. Meet you tomorrow at the same time, 5:00, using the same link. Thank you all for your time. Bye.
>> Thank you, sir.

Okay, so yesterday we had the introductory session on GCP data engineering: who is a data engineer, data engineer roles and responsibilities, Google Cloud Platform, the various GCP data engineering products and services, and the GCP job market.
I'll take 5 minutes on what I discussed in the last class.
First, the advantage of cloud over on-premises. On-premises, we need to set up our own hardware infrastructure and, on top of that, OS installation, purchasing of database licenses, and designing of database objects. To maintain all this, we require separate teams: a hardware team, an operating system team, a networking team, a security team, and database administrators. But in the cloud, there is no need to maintain any of this: no infrastructure, no licenses. Everything is in the cloud.
The various cloud service providers are AWS, Azure, GCP, Oracle Cloud, Salesforce, VMware, Liquid Web, Alibaba Cloud, Rackspace, and DigitalOcean. Most companies prefer GCP, Azure, or AWS.
Why GCP as compared with Azure or AWS? Over the past year, as compared with other clouds, GCP's growth is increasing rapidly.
It is the biggest network, and it is the fastest growing. Here we store data, we retrieve data, and we perform data analytics. We get high availability of the data: if data is lost, we can retrieve it again from other regions.
Data is safe and secure. It is easy to scale. GCP is flexible.
Why choose GCP?
It is the fastest, biggest network in the world.
Petabyte bandwidth.
Separation of compute and storage.
Flexible pricing. A lot of discounts are applied if you're running huge workloads.
Customization. We can do a lot of customization as per your requirements.
Scalability. You can increase or decrease capacity as per requirement.
Live migration. In some of the other clouds it's not available; in GCP, we can migrate.
Big data and machine learning. Many inbuilt APIs are available for big data and machine learning.
Free trials. Some services are free for lifetime, some are free for 1 year, and some are free for 3 months.
Customer systems migration.
These are the things which make GCP popular.
Previously, I said, we had the ETL developer. As compared with the traditional ETL developer, the data engineer has a lot more responsibilities.
Data engineering involves handling everything to do with the data: ingesting, filtering, transforming, processing, merging, grouping, sorting, loading, and analyzing it.
Data engineering involves how to extract the data, what the sources of the data are, how to process the data, and what the destination of the data is.
A data engineer designs and maintains an organization's entire data infrastructure.
A data scientist strategizes for the future.
A data analyst summarizes and visualizes the data of the past.
The data scientist has a lot of other responsibilities, like performing data analysis, data predictions, data classification, clustering, and recommendations for the future. The data analyst summarizes and visualizes the data of the past. The data engineer is mainly for data management, data storage, and data processing.
A data engineer should work with all varieties of data: structured, semi-structured, and unstructured data, streaming data, and NoSQL data, in all file formats like CSV, JSON, Parquet, and XML.
Okay, coming to GCP.
It is a cloud computing service. It allows customers to have Platform as a Service, Infrastructure as a Service, and serverless compute.
Generally, a data engineer should ensure data quality, optimize data storage and data processing, collaborate with the data scientists and data analysts to provide data as per their requirements, and monitor the data infrastructure, meaning checking that data is available at each phase of the process.
ETL (extracting, transforming, and loading) is the key task of a data engineer, who pulls the data from different sources, transforms it, and loads it.
So, the various GCP products and services that I'll be discussing as part of this course:
Google Cloud Storage for storing the data.
BigQuery for fast querying, storage, and analysis of large data sets.
DataProc for big data processing and analytics.
Databricks for PySpark processing in GCP. PySpark is a very high-speed execution model for processing huge data in a very short time.
Dataflow for streaming and batch data processing. Data which keeps on being generated we call streaming data.
Batch processing means processing millions or trillions of records in user non-interactive applications. We call them batch applications.
There is online processing and there is batch processing. Online processing means user-interactive; batch means user non-interactive. In any online application, the user must interact by providing details. But in batch, there won't be any user involvement or user input.
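The difference between batch data and streaming data can be sketched in a few lines of plain Python. This is only a conceptual illustration, not Dataflow or Apache Beam code; the function names and the sample amounts are assumptions made for the example.

```python
from typing import Iterable, Iterator, List


def batch_total(records: List[float]) -> float:
    """Batch: the whole data set is available up front and is processed in one go."""
    return sum(records)


def streaming_totals(events: Iterable[float]) -> Iterator[float]:
    """Streaming: events arrive one at a time; emit an updated total after each."""
    total = 0.0
    for amount in events:
        total += amount
        yield total


def event_source() -> Iterator[float]:
    """Hypothetical event source. A real stream keeps generating; this one is
    finite so that the example terminates."""
    for amount in (10.0, 20.0, 5.0):
        yield amount


print("batch:", batch_total([10.0, 20.0, 5.0]))
for running_total in streaming_totals(event_source()):
    print("stream:", running_total)
```

The batch function returns one answer after seeing everything, while the streaming function produces a new answer every time an event arrives. Neither needs any user input while it runs.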
Cloud Composer (Apache Airflow) for automating and monitoring data pipelines.
Pub/Sub for messaging and for data ingestion.
Now, mainly about PySpark execution. PySpark can be up to 100 times faster than normal disk-based processing.
There was a test conducted. Yahoo took a table of 100 TB consisting of 1,024 columns. The task was sorting based on 16 columns.
Oracle took 3.5 days.
MySQL took around 6 days.
Teradata took 4.5 hours to process this 100 TB of data, and Netezza took just 3 hours. Teradata and Netezza are specialized databases used for data warehousing.
Big data Hadoop took just 3.4 minutes.
So compared with the ones that took days and the ones that took hours, Hadoop took minutes.
And Spark is faster again. Hadoop already has very high speed, and Spark can be up to 100 times faster than Hadoop.
Now I'll give some introduction to how we can interact with the GCP services. I was talking about the GCP products: Cloud Storage, Cloud SQL, BigQuery, Cloud DataProc, all these things.
To interact with GCP services, there are majorly three methods. A fourth method is also available, but I'll discuss it separately.
The three are: using the console, using Cloud Shell, and using the Cloud SDK (software development kit).
First, using the console. It's the simple way. I'll show you.
The console is at console.cloud.google.com. One second, it's asking for the password.
Whenever you take a free account, some services are free for lifetime, some are free for 1 year, and some are free for 3 months.
So, this is the Cloud Console. Welcome to the Cloud Console.
From the console you can open anything. This is the navigation menu, where you can see all the components. This is the project area.
For example, Cloud Storage. If you open Cloud Storage, you can see the buckets. Many buckets are available; these are the buckets that we created already.
Whether you want Cloud Storage, Cloud SQL, or BigQuery, you can open any service, product, or component from this console.
Here in Cloud SQL are the MySQL instance and the PostgreSQL instance which I created.
So, what is the console? The console is a web-based graphical user interface, a web-based UI.
Using the console, we interact with GCP services.
In the console we can create a resource, delete a resource, update a resource, and apply or revoke permissions.
Using the console, you can perform all of this.
I'll come to Cloud Shell, but first: what is the difference between a service and a resource?
Put simply, a service is offered by GCP, and a resource is created by the user.
We have many services. For example, BigQuery is a service. Under it, we create resources, like table one, table two, and so on.
GCS, meaning Google Cloud Storage, is a service. Under it, we create resources like bucket one, bucket two, bucket three, and so on.
So under a service, we create resources.
Services are offered by GCP for their own business purpose. Resources, like tables and buckets, you create for your own project purpose.
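The service and resource relationship can be modelled with a small data structure. This is a conceptual sketch only, not a GCP API; the `Service` class, the `catalog` dictionary, and the resource names are hypothetical and exist just to show that the platform offers the services while the user creates, lists, and deletes the resources under them.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Service:
    """Something the platform offers, such as BigQuery or Cloud Storage."""

    name: str
    resource_kind: str                       # what users create under this service
    resources: List[str] = field(default_factory=list)

    def create(self, resource_name: str) -> None:
        """A user creates a resource under this service."""
        if resource_name in self.resources:
            raise ValueError(f"{self.resource_kind} {resource_name!r} already exists")
        self.resources.append(resource_name)

    def delete(self, resource_name: str) -> None:
        """A user deletes one of their resources."""
        self.resources.remove(resource_name)


# The catalog of services is fixed by the platform (illustrative subset).
catalog: Dict[str, Service] = {
    "bigquery": Service("BigQuery", "table"),
    "gcs": Service("Google Cloud Storage", "bucket"),
}

# Resources are created by the user for their own project.
catalog["bigquery"].create("table_1")
catalog["bigquery"].create("table_2")
for bucket in ("bucket_1", "bucket_2", "bucket_3"):
    catalog["gcs"].create(bucket)
catalog["gcs"].delete("bucket_3")

for service in catalog.values():
    print(f"{service.name}: {service.resource_kind}s = {service.resources}")
```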
Next is Cloud Shell, the second way. Up to now we saw the Cloud Console.
Cloud Shell is a web-based command-line interface.
The console is a web-based GUI, but Cloud Shell is a web-based command-line interface. Here we use commands.
How do we work with Cloud Shell? Just observe.
This is the Google Cloud home page. Here you can see Activate Cloud Shell.
When you click Activate Cloud Shell, a terminal opens. Everything here is command-based.
I'm running bq ls. If BigQuery has anything, bq ls will display it; otherwise it shows nothing. It takes some time. No, nothing is there.
Now, for example, I'm running gsutil ls. gsutil is the Google storage utility, and gsutil ls lists all the buckets.
See, multiple buckets are available. I already created them.
Here we are using commands to interact with GCP services. We'll be using Linux-style commands like ls and rm.
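The same two listing commands can also be driven from a script. The sketch below is one possible way to do it, assuming the `bq` and `gsutil` tools are installed and authenticated as they are in Cloud Shell; the helper name `run_if_available` and the `COMMANDS` table are made up for this example. The helper returns `None` instead of failing when a tool is not installed.

```python
import shutil
import subprocess
from typing import Dict, List, Optional

# The two commands used in the session: list BigQuery datasets, list buckets.
COMMANDS: Dict[str, List[str]] = {
    "datasets": ["bq", "ls"],
    "buckets": ["gsutil", "ls"],
}


def run_if_available(argv: List[str]) -> Optional[str]:
    """Run a command-line tool and return its standard output.

    Returns None when the tool is not on PATH, so the script can still be
    tried on a machine without the Cloud SDK.
    """
    if shutil.which(argv[0]) is None:
        return None
    result = subprocess.run(argv, capture_output=True, text=True, check=False)
    return result.stdout


def list_everything() -> None:
    """Print the output of each listing command, or a note if the tool is missing."""
    for label, argv in COMMANDS.items():
        output = run_if_available(argv)
        if output is None:
            print(f"{label}: {argv[0]} is not installed here")
        else:
            print(f"{label}:\n{output}")
```

Calling `list_everything()` inside Cloud Shell should print the same listings as typing the two commands by hand. It is left uncalled here because it needs an authenticated environment.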
Here, if you want, you can also work with Python.
Type python and press Enter.
The Python 3.12 shell opens, and you can run any valid Python statement.
23 + 42.
It gives the result.
Control L clears the screen.
Any valid Python statements: x as 10, y as 4.
What is x? What is y? What is x + y?
What is x - y? x times y?
x divided by y?
Yes.
x times 2.
x squared: 10 squared is 100.
Control L.
Okay.
If you want to come out of this, say quit(). Otherwise, set x equal to hello.
x times 3 prints hello three times.
You can work with any valid Python statements here.
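The statements typed into the Python shell in this part of the demo can be collected into one short script. This is a sketch of the same expressions, not a recording of the session; the variable name greeting is an assumption (in the demo, x itself is reassigned to the string).

```python
# Simple arithmetic, as typed in the Python shell.
print(23 + 42)   # 65

x = 10
y = 4
print(x + y)     # 14
print(x - y)     # 6
print(x * y)     # 40
print(x / y)     # 2.5 (true division always returns a float)
print(x * 2)     # 20
print(x ** 2)    # 100

# Multiplying a string repeats it.
greeting = "hello"
print(greeting * 3)  # hellohellohello
```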
If you say python --version, it displays the Python version.
Mhm.
java --version displays the Java version.
Everything is already available.
You don't need to install these things.
Yes, we can create buckets from the shell also.
You can create a bucket in the console and check it in the shell, or create it in the shell and check it in the console. You can also create one from the Cloud SDK, which you install on your local machine.
On your Windows system, you install the Cloud SDK, and you can create and check buckets from there as well.
I can create one bucket from the console, one from the shell, and one from the SDK. Whichever way I create them, I'll be able to view all of them.
Okay, so sir, the most recommended way is to use the console?
The console is GUI-based, but the command way is also very simple: only a single command.
To create a bucket directly in the console, there is a Create option. Click on it, it asks for a name, you give a name and say Create, and it gets created.
With a command it is also very simple: creating a bucket takes a single line.
It depends. If you want to go programmatically, go for the command-based way.
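As an illustration of the single-line command, the sketch below builds the gsutil mb (make bucket) command for a given name. The helper make_bucket_command and the bucket name demo-bucket-12345 are hypothetical, and the name check is a simplified subset of the real bucket naming rules, included only to catch obvious mistakes before the command is run.

```python
import re

# Simplified subset of the bucket naming rules (assumption for this sketch):
# 3 to 63 characters; lowercase letters, digits, dashes, underscores, dots;
# must start and end with a letter or digit.
_BUCKET_NAME = re.compile(r"^[a-z0-9][a-z0-9._-]{1,61}[a-z0-9]$")


def make_bucket_command(bucket_name):
    """Return the gsutil 'make bucket' command as an argument list."""
    if not _BUCKET_NAME.match(bucket_name):
        raise ValueError(f"not a valid bucket name: {bucket_name!r}")
    return ["gsutil", "mb", f"gs://{bucket_name}"]


# Hypothetical bucket name, for illustration only.
print(" ".join(make_bucket_command("demo-bucket-12345")))
# prints: gsutil mb gs://demo-bucket-12345
```

The printed line is what would be typed in Cloud Shell or in a local Cloud SDK terminal; bucket names are globally unique, so a real run needs a name nobody else has taken.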
Okay.
Fine.
So that is how we can interact: either the GUI-based console or the command-based CLI. You can use either of them. And if we talk about the CLI, Cloud Shell itself has two parts.
If you observe, Cloud Shell has a terminal and an editor. What we have now is the terminal.
If you say Open Editor, the editor opens.
This is the editor.
It is just like VS Code, if you have an idea about VS Code.
If I want to create a file and execute it here, I can right-click.
I'll create a file.
I'm calling it 123.py (or demo.py). In it: x as 10, y as 20, print x + y. I can write code within this .py file.
I can execute it.
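The file described here would contain something like the following; this is a sketch of the three statements spoken in the demo, saved in a .py file from the Cloud Shell editor.

```python
# Contents of the demo .py file created in the Cloud Shell editor.
x = 10
y = 20
print(x + y)  # prints 30
```

It can then be run from the Cloud Shell terminal by passing the file name to python.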
Next.
So sir, basically GCP is like a one-stop destination where you can code, deploy, and run. Is it like that?
Yes, in GCP you can perform everything within the cloud environment.
Okay, even the coding?
Yes, everything you can do here.
Okay.
So if I want to push my code, where can I push it in GCP?
I'll discuss that.
One second.
So here we have many things: many components, many services. I'll be discussing them one by one.
I was talking about the console and the shell, from where you can interact with any service: Cloud SQL, Cloud Storage, BigQuery, anything. Okay.
Now, see the difference between these two.
Cloud Shell is a cloud-based CLI. It uses commands to interact with the services.
The third one, Cloud SDK, is also command-based. So what is the difference between Cloud Shell and Cloud SDK?
Cloud Shell is web-based.
Cloud SDK is not web-based. It is local: you install it on your local machine and work with it there.
Cloud Shell and Cloud Console are both web-based, but the SDK is local. Whatever commands we use in Cloud Shell to interact with the various services, the same commands work in Cloud SDK.
I'll be showing you step by step how to download the Cloud SDK, how to install it, and how to work with it.
For now I was just talking about the shell: a command-line interface where you can write and run commands.
Yes. So, this is what I'm going to discuss as part of the course.
See: GCP data engineering fundamentals, Google Cloud Storage, Cloud SQL for setting up a database, BigQuery for building a data warehouse, Dataproc for big data processing, Databricks for PySpark processing, Dataflow for Apache Beam development, Google Cloud Composer for orchestration, Data Fusion for data integration, and dbt (data build tool), Airflow, and Terraform. These ten things we'll be discussing.
In detail: introduction to the cloud platform; GCP data engineering fundamentals; starting with GCP, getting a free account, billing, and project creation; storage and databases in GCP; working with Google Cloud Storage; setting up a data lake using GCS; setting up a Postgres database using Cloud SQL; data integration and data ingestion; data processing and transformations; data warehousing and analytics using BigQuery for building a data warehouse.
Yes, Dataproc for big data processing, and ETL data pipelines using Dataproc.
PySpark processing in GCP.
ELT pipelines using Databricks.
Integration of Spark on Dataproc with BigQuery.
Dataflow and Apache Beam development.
Batch and streaming processing.
Pub/Sub. Google Cloud Composer for data pipeline orchestration.
Data Fusion, Cloud Functions, Terraform. Data pipelines using dbt, Airflow, and BigQuery.
Machine learning in GCP.
Real-world GCP data engineering scenarios: building a streaming pipeline, building a batch pipeline, and data migration from on-prem to GCP.
That is for when you want to migrate from on-premises to GCP. Then designing a hybrid data lake: BigQuery with data lakes in GCS. Yes.
So this is what we'll be discussing in depth. You can also check the offerings from my side.
You will get the video of each and every session, a soft copy of the class notes, assignments and tasks to work with, and a WhatsApp group where you can directly interact with me.
Around 200 interview questions, live interactive hands-on sessions, and Python videos. And from the DurgaSoft side, there are the prerequisites: basic Python, basic SQL, basic Linux.
For that, they provide 100 videos on Python.
Some 30 or 40 videos on MySQL.
Some 100 videos on Linux.
You get free access to these MySQL, Linux, and Python videos once you enroll in this course.
So everything, starting from the prerequisites, I'll be discussing.
From next week we'll be starting with the actual things: getting the subscription and working on the practical part.
Next week the link will change.
You need to enroll to get the new link.
In the Zoom chat you can see a big message posted by the online team.
You can just copy that message.
I don't see any message, sir.
One second.
Okay, they'll be posting it now.
The duration of this course will be 12 weekends, about 3 months.
2 hours on Saturday and 2 hours on Sunday.
Initially, for the first 3 or 4 weeks, it will be 5:00 to 7:00.
Later it will be 6:00 p.m. to 8:00 p.m.
So 2 hours on Saturday evening and 2 hours on Sunday evening, for around 12 weekends.
We'll be discussing all of this practically.
Those who want to do the practical work in parallel with me during the session can do so.
If you hit any errors in the settings or while creating anything, you can share your screen. And if you get errors while practicing, there's no need to wait until the next week.
The WhatsApp group is available. You can take screenshots of the errors and post them in the group, and they'll be answered. Yes.
Anyhow, I'll be discussing the Python syntax and Linux syntax before I go further. But still, if you don't have that knowledge and you are free this week before we start, get enrolled, get the Python, MySQL, and Linux videos, and go through them.
Yes. Okay.
You will get a mail with the course content details, payment details, and so on. Get enrolled to get the video access.
You'll be getting clear notes.
Yes. Next week, the link will change. You need to enroll to get the paid link, sent to your mail.
Videos will be added to your Google Drive daily. Notes will be sent to your mail.
Okay.
So, we'll start next week with the actual things. They didn't post the message here; they'll send it through mail. Whatever mail you used to join the meeting, you'll get the payment details at that address. There is a note given in that mail; you can check everything there.
Fine.
So, get the free subscription for 3 months.
Within those 3 months you can work freely. Don't take any paid subscription.
The free subscription is more than enough.
26,000 credits will be added.
With those 26,000 credits you can use all the products and services, including the premium ones.
Okay.
Okay, fine. So, we'll meet next Saturday for more discussion on this.
We'll continue the course from there.
Thank you for your time. Thank you. Meet you next Saturday.