
Prometheus Monitoring Architecture Explained | Pull vs Push #prometheus #interviewquestions
Added: 2026-05-26

100 views • 5 • 25:22 • technicalcloudknowledge • Original Release: 2026-05-26

Prometheus uses a pull model for metric collection instead of push because it provides easier service discovery through Kubernetes API integration, enables centralized scraping of all services, offers better health monitoring by detecting target failures, simplifies debugging, and avoids the scalability issues that occur when thousands of targets push metrics simultaneously. In the pull model, Prometheus periodically scrapes metrics from HTTP endpoints exposed by targets at a configurable scrape interval, commonly set to 15 to 30 seconds. The push model, by contrast, makes it hard to detect target failures, complicates metric cleanup, and limits scalability.

[00:00:02]Hey everyone this is Shikh again from Technical Cloud Knowledge and I hope you all are doing great. So as you all can see today we will cover interview questions on Prometheus.

[00:00:13]So what is our first question? Why does Prometheus pull metrics instead of push?

[00:00:22]First of all, let me tell you that this is a very important and tricky question that you can be asked in an interview about Prometheus or, say, monitoring.

[00:00:32]So you should have knowledge of this because most of the people get confused while answering this question.

[00:00:40]Ok?

[00:00:41]So what the question asks, put simply, is: why does our Prometheus server, that is, our monitoring server, pull the metrics? Why doesn't it push? Right? So that is the question, and what is its expected answer? It is written: Prometheus mainly uses a pull model. Why? The first reason is easier service discovery. The second reason is centralized scraping. The third reason is better health monitoring. The fourth reason is simple debugging.

[00:01:13]The fifth reason is that targets expose their metrics on an HTTP endpoint. Those are the benefits of pull. Then it further explains that the push model creates issues. The first is that it is hard to know if a target is alive.

[00:01:27]Second, metric cleanup becomes difficult, and third, scalability problems.

[00:01:36]Now, if we just read the question and answer, we can follow them without understanding what they mean in practice. But if we work through a real-time scenario, it becomes easier to understand what is actually being asked and how to answer it in front of an interviewer. So let us understand this in a better way. We will go through the whole interview question using a real-world industry scenario, which will make it easier to understand.

[00:02:16]Ok? So let's say this is one of your Prometheus servers. Ok?

[00:02:24]By default, we know that the Prometheus server has three major components. And what are those three major components?

[00:02:34]Look, the first component is Prometheus's data retrieval worker.

[00:02:45]What does it do? The data retrieval work. The second component is a time series database. The third component is the Prometheus UI, its web console user interface. Ok?

[00:03:07]So these are the three major components of Prometheus.

[00:03:10]Now all three have their own job. If we first talk about our data retrieval worker, what does it do? Basically it helps us scrape the metrics.

[00:03:21]Whose metrics?

[00:03:24]If we need to scrape the metrics of the application or its pods, who helps us with that?

[00:03:28]Our data retrieval worker. Ok?

[00:03:32]Now, the data retrieval worker scrapes the metrics. First of all, let me write down its function here.

[00:03:37]Scraping the metrics.

[00:03:40]Ok?

[00:03:44]Second, when it scrapes the metrics, it checks whether those metrics are exposed on an HTTP endpoint.

[00:03:52]Whatever metrics are exposed on an HTTP endpoint, it scrapes. Otherwise it does not scrape. So how does it scrape the metrics? It scrapes metrics that are exposed on HTTP endpoints. Ok? So should we write it here? How does it scrape the metrics?

[00:04:12]It scrapes metrics exposed on HTTP endpoints.

[00:04:15]Now, we know that Prometheus has a time interval: every time that interval passes, Prometheus goes and scrapes the metrics of your application and pods. In the sample configuration that scrape interval is 15 seconds.

[00:04:32]Ok? And sometimes it is set to 30 seconds.

[00:04:35]It depends; this is just the usual starting value.

[00:04:37]You may need to change this interval according to your situation, your monitoring server, or your production environment, for example.

[00:04:43]That is up to you. You can also increase it. There is nothing to it.

[00:04:49]Go to the configuration file of Prometheus and set the interval you want. I have told you the usual value. Ok? So what does it do? It scrapes the metrics of the application or pods.

[00:05:01]It scrapes metrics exposed on HTTP endpoints, and the time interval is 15 to 30 seconds.
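
What the data retrieval worker receives from a target's HTTP endpoint is plain text, and a rough Python sketch of reading it may help. This is only an illustration under stated assumptions, not Prometheus's own parser: `parse_exposition` is a hypothetical helper that understands only simple `name{labels} value` lines, and the payload and metric names are made up.

```python
import re

# Matches lines such as: http_requests_total{method="GET",code="200"} 1027
SAMPLE_LINE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_PAIR = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_exposition(text):
    """Parse a simplified subset of the Prometheus text format.

    Returns a list of (metric_name, labels_dict, value) tuples.
    Comment lines (# HELP, # TYPE) and blank lines are skipped.
    """
    samples = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_LINE.match(line)
        if match is None:
            continue  # ignore anything this simplified parser cannot read
        name, label_blob, value = match.groups()
        labels = dict(LABEL_PAIR.findall(label_blob or ""))
        samples.append((name, labels, float(value)))
    return samples


# Hypothetical payload, as a target might return it from its HTTP endpoint.
PAYLOAD = """\
# HELP http_requests_total Total HTTP requests.
# TYPE http_requests_total counter
http_requests_total{method="GET",code="200"} 1027
http_requests_total{method="POST",code="500"} 3
payment_failures_total 12
"""

for name, labels, value in parse_exposition(PAYLOAD):
    print(name, labels, value)
```

A real scraper would fetch this text over HTTP once per scrape interval; the sketch leaves that out to stay focused on the format.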

[00:05:07]Now this question also arises: what type of metrics does it scrape?

[00:05:11]Take an example. We all know that when we do monitoring, some of it is at the OS level, some at the application level, some at the system level. Whatever we monitor, there are a few major things we check, like CPU usage and memory usage. We check the total HTTP request count on the application, we check the error count, and if there is a payment service, then naturally we also check payment failures, that is, how many times a payment has failed. So these are the metrics we scrape. Ok? Second comes our time series database. Now what is this time series database? It persistently stores the metrics that the data retrieval worker has scraped. The scraped metrics have to be stored somewhere, because if you want to see this data in the future, how will you see it? So what do we have for that? The time series database.

[00:06:18]What does it do? It stores those metrics persistently along with some extra information and labels. For example, CPU usage was scraped as a percentage.

[00:06:29]It is also recorded which timestamp that CPU value belongs to. Ok? Which node does it belong to? Which instance? What is its IP address? There are some extra labels on this instance. Everything. So it also stores this kind of extra information with the data.
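
The idea of a value stored together with its timestamp and labels can be sketched as a toy in-memory structure. This is a simplified illustration only; `TinySeriesStore`, the metric name, and the label values are hypothetical and say nothing about how Prometheus lays out data on disk.

```python
from collections import defaultdict


class TinySeriesStore:
    """Toy in-memory store: one list of (timestamp, value) per series."""

    def __init__(self):
        self._series = defaultdict(list)

    @staticmethod
    def _key(name, labels):
        # A series is identified by its metric name plus its full label set.
        return (name, tuple(sorted(labels.items())))

    def append(self, name, labels, timestamp, value):
        self._series[self._key(name, labels)].append((timestamp, value))

    def latest(self, name, labels):
        """Return the newest (timestamp, value) of one series, or None."""
        points = self._series.get(self._key(name, labels))
        return points[-1] if points else None

    def select(self, name, **matchers):
        """Return the series of `name` whose labels contain all matchers."""
        result = {}
        for (series_name, label_items), points in self._series.items():
            labels = dict(label_items)
            if series_name == name and all(labels.get(k) == v for k, v in matchers.items()):
                result[label_items] = points
        return result


store = TinySeriesStore()
node1 = {"node": "node-1", "instance": "10.0.0.1:8080"}
node2 = {"node": "node-2", "instance": "10.0.0.2:8080"}
store.append("cpu_usage_percent", node1, 1700000000, 41.5)
store.append("cpu_usage_percent", node1, 1700000015, 43.0)
store.append("cpu_usage_percent", node2, 1700000015, 12.0)

print(store.latest("cpu_usage_percent", node1))
print(len(store.select("cpu_usage_percent", node="node-1")))
```

The `select` call plays the role of a query typed into the console: it picks series by metric name and label values and returns their stored points.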

[00:06:43]Third, what do we have?

[00:06:46]The last component of Prometheus is the Prometheus UI, basically your console. Ok? This is your console, which we also call a web server. Its basic function is that anyone can come to it and run a simple query. For example: I want to see the CPU usage, I want to see the memory usage, I want to check right now whether my node is active. That query is run here in this console, and the console takes your data from the time series database and displays it in front of you.

[00:07:22]Ok? Any data, like the CPU usage, memory usage, or HTTP request count I told you about, or the total count, the error count, anything. So this is how it shows it to you. These are the major components of Prometheus that do this work. Now, to understand the interview question, suppose the following.

[00:07:42]Suppose this is your e-commerce application, like Amazon, Flipkart, or any other. Right? Ok? This is your e-commerce application.

[00:07:56]Ok? Now what is this application of ours running on? On a Kubernetes cluster.

[00:08:01]So here we have set up the cluster, and my application is running on it. Ok?

[00:08:07]Initially, our payment service is running on it, and at the start we created 10 pods for the payment service.

[00:08:15]How many pods are running? 10 pods. Ok? So 10 pods are running on your side.

[00:08:24]Ok? So suppose you have 10 pods running. Now the traffic on the application is completely normal. Ok?

[00:08:32]Nothing much, just normal traffic. Suppose up to 5 MB of data and some traffic comes in during the day.

[00:08:39]Ok? So not much traffic is coming in. There is little traffic.

[00:08:42]With traffic like that, our 10 pods alone can handle it. What happened? Why is this not getting made?

[00:08:51]Ok? Suppose this is our traffic.

[00:08:56]We have customers. Ok?

[00:09:02]Our customers come in from the internet. They hit DNS.

[00:09:06]From DNS they are sent on to the load balancer. Then our load balancer distributes the requests further. Right? If two load balancers are installed in production, then distribution runs accordingly. The requests go to the nodes.

[00:09:15]Then on to our pods.

[00:09:17]So this is the same concept we have been following: the customer hits the site, and the request goes to a pod.

[00:09:23]That's the whole concept. When customers hit the site, our 10 pods could easily handle that much traffic.

[00:09:30]Now suppose a day came when the company thought that business is a bit slow right now, so why don't we adopt a strategy? According to that strategy, suppose the company started an offer. Every company, every website, every brand runs offers nowadays: some for clothing, some for footwear, some for bags. A brand offers you, say, 15% off in the clothing section and 20% off on footwear. In the same way, this application also decided to organize such a sale.

[00:10:04]So guess what it launched? A Big Billion Day Sale. That is how offers work. Now tell me one thing: who does not go for an offer? Everyone likes cheap stuff. Right? If something expensive becomes available at a low price, everyone thinks, let's buy it, we have been planning to for quite some time. Like during Diwali, many things are available at a discount.

[00:10:23]So everyone buys. So let's just assume that a sale was going on. The e-commerce application sells everything, from clothing to electronics. Now I see that the holidays are on and there is more customer traffic. The traffic has increased.

[00:10:40]Many people came to buy this way. Now the traffic has increased so much that these 10 pods are no longer capable of handling it. Right? Because currently there are 10 pods, and they won't be able to handle that much traffic. So autoscaling kicked in. That is normal: if Kubernetes is set up with autoscaling, it happens automatically. It saw that 10 pods could not handle the load, so what did it do? It autoscaled to 50 pods, which means it made 40 more pods ready. So how many pods were there earlier? 10. How many now? 50. Now here comes the most interesting point, you could say the most important question: how will Prometheus know that new pods have been created?

[00:11:26]How will Prometheus know that the new pods have arrived? Right?

[00:11:31]How did Prometheus know that new pods have arrived? Let's write it here: how does Prometheus know new pods are created? Ok?

[00:11:51]How could Prometheus know this? Right? This is the most important question.

[00:11:56]So what happens is that Prometheus does not go directly to your cluster to ask whether any pod has been added, which one was deleted, which one is running, which IP it is running on, or what its labels are. No, it does not go to the cluster. Nor does it go to the pods to ask. Prometheus gets this information from neither of them.

[00:12:18]So from whom will it ask for that information directly? It will ask the Kubernetes API. From whom? The API server.

[00:12:29]So what happens now?

[00:12:31]The API server runs inside Kubernetes. Right? So the API server has visibility into the entire cluster: which pod is running, which pod was deleted, what the IP address of each pod is. It in turn reads this from the database running behind it, but we are not going that deep. Just understand this much: the API has all the information about its cluster. Now look, as I said, Prometheus asks neither the cluster nor the pods. It goes directly to the Kubernetes API. It tells the API that it needs information. It needs to know, for this e-commerce application of ours running on the cluster: what is the total number of pods in that cluster? What are their IP addresses? Ok? How many pods are running?

[00:13:11]Ok? What labels do they have? Which one was deleted? Which one is stopped, which one is paused? This way it finds out everything. And once it has found out, see what happens. It is not that Prometheus learns that 50 pods of this application are running in the cluster and so monitors all 50 of them.

[00:13:31]No, we tell Prometheus in its configuration file, prometheus.yaml, which labels to look for.

[00:13:38]Now let's see what happens. It is not necessary that you put it in that YAML file.

[00:13:43]You can also put it in a different file. Sometimes a separate file is created just for service discovery.

[00:13:47]Basically, if you want, create a service discovery file. Instead of going into prometheus.yaml, service discovery gets its own file.

[00:13:53]You define it inside that file. Right? So here we are not naming a particular file. We can also create an ordinary file if we want.

[00:14:00]So you state in a file, in its configuration, which labels to look for, so that only the pods carrying them are monitored.

[00:14:10]Scrape data from pods that have this particular label. So what does Prometheus do? It does not start scraping everyone. Prometheus looks at whether this label is present on a particular pod. Right? Take a label for example: monitoring=enabled. So now Prometheus checks each pod to see if it has this label before scraping its metrics. When it finds this label on a pod, Prometheus automatically scrapes the metrics for that particular pod.

[00:14:43]How will the scrape work now? As usual, it will send a request to the endpoint and start scraping the metrics.

[00:14:49]The pods serve their metrics, and it scrapes them. So now whenever any new pod comes up, scraping happens automatically.

[00:14:54]Any pod deletion is automatically detected by Prometheus too. Right? Because the API is giving it all the information.

[00:15:01]Everything. So this is why easier service discovery is possible in the pull model.
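
The discovery flow described here (ask the API for the pod list, keep only the running pods that carry the opt-in label, scrape those) can be sketched as follows. Everything in the sketch is a stand-in: `fake_api_list_pods` fabricates the API server's answer, the label is written as `monitoring=enabled`, and port 8080 is an assumed metrics port.

```python
def discover_targets(pods, label_key="monitoring", label_value="enabled", port=8080):
    """Return scrape addresses for running pods that carry the opt-in label."""
    targets = []
    for pod in pods:
        if pod["phase"] != "Running":
            continue  # skip pods that are pending, stopped, or deleted
        if pod["labels"].get(label_key) != label_value:
            continue  # skip pods that did not opt in to monitoring
        targets.append(f'{pod["ip"]}:{port}')
    return sorted(targets)


def fake_api_list_pods(replicas):
    """Stand-in for the Kubernetes API server's answer (hypothetical data)."""
    pods = [
        {
            "name": f"payment-{i}",
            "ip": f"10.0.0.{i + 1}",
            "phase": "Running",
            "labels": {"app": "payment", "monitoring": "enabled"},
        }
        for i in range(replicas)
    ]
    # One pod without the label, to show that it is ignored.
    pods.append({"name": "batch-job", "ip": "10.0.1.1", "phase": "Running",
                 "labels": {"app": "batch"}})
    return pods


before = discover_targets(fake_api_list_pods(10))  # normal traffic
after = discover_targets(fake_api_list_pods(50))   # after autoscaling
print(len(before), len(after))  # prints: 10 50
```

Because the target list is recomputed from the API's answer on every refresh, the 40 new pods are picked up without any change on the pods themselves, which is the point being made about pull.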

[00:15:08]So the first point, which we called easier service discovery, is that Prometheus automatically gets to know about the services. It gets to know about its targets: which targets were added, which were deleted, everything.

[00:15:20]So, easier service discovery. What did it say second? Centralized scraping. Let's talk a little about centralized scraping.

[00:15:26]Now look, it is a normal thing, one of our applications is running.

[00:15:30]Only payment service cannot be running on the application, right?

[00:15:32]There are multiple other services also running in the application. Assume an nginx service is running, Redis is running, MySQL is running, and there is a cart service. Ok? And a user service.

[00:15:47]Multiple services are running in this manner.

[00:15:50]Ok? Should we keep it here inside the application?

[00:15:52]Suppose there are multiple services running in our application. Now Prometheus is a central monitoring server. Isn't it? That is the centralized monitoring server.

[00:16:02]What does it do? It monitors the application and all the other services running in it. Right? Centralized scraping is taking place. Everyone shares the same server, centrally. At the center we have Prometheus scraping each service.

[00:16:14]Meaning, pulling the metrics of each one. Right?

[00:16:20]And how often? It scrapes all the services, pods, nodes, and database metrics every 15 seconds or 30 seconds, every single one.

[00:16:26]Like CPU usage, memory usage, request count, the health of the pod, the health of the application, and whether the application is reachable or not. Right? In this way. Now observe one thing here.

[00:16:39]Here Prometheus made things a little easier. Right? The application no longer has to worry about going and handing data to Prometheus, saying: this pod of mine is newly created, this one is deleted, these are its metrics, save them. Here everything is organized centrally and controlled by the Prometheus server. That is how centralized scraping is provided. Let us talk next about better health monitoring. What do we mean by better health monitoring? Suppose 50 pods are running right now, and suddenly some of your pods get deleted or crash. Right? That happens too.

[00:17:17]Say the CPU was overloaded, or anything else, and one of our pods crashed.

[00:17:27]Now see what happens. Do we know what Prometheus does?

[00:17:32]It keeps sending requests to the API automatically. Right?

[00:17:35]And it scrapes the metrics of the targets it learns about there.

[00:17:37]This is what Prometheus does.

[00:17:40]Now in such a situation, if any pod crashes, Prometheus gets to know about it right away.

[00:17:46]Why? Because it keeps checking every 15 seconds. Right? It checked the total number of pods running and realized something was wrong. It checked the up metric, and it showed zero. So yes, some of our pods are going down.

[00:17:58]It looked at the labels. So what will it do?

[00:18:01]What will it do quickly? It will trigger an alert through the Alertmanager.

[00:18:03]To whom? To our engineers. Whatever channels you have integrated in the Alertmanager, the alert goes to that channel, whether you want to be notified on Slack, by webhook, email, WhatsApp, call, SMS, anything, and it reaches the engineers.

[00:18:19]Now the engineers know instantly that a pod has crashed. There is some issue in the pod, or there is a network timeout issue, or a DNS issue; whatever it is, the alert reports it. So Prometheus does health monitoring in a better way and reacts early to problems in the application.

[00:18:38]As soon as it finds something wrong, say the CPU is at 80% or above, it immediately sends an alert and the engineer takes action automatically, immediately. Sorry, not automatically; the engineer takes action immediately.

[00:18:47]So that's why health monitoring is a little better. Better Health Monitoring.
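
The health check just described can be sketched in a few lines: a failed scrape is recorded as an up value of 0, and every target at 0 produces an alert. This is a rough sketch under assumptions: the alert name `TargetDown`, the label values, and both helper functions are hypothetical, and real alerting is defined through alerting rules and routed by the Alertmanager rather than by code like this.

```python
def evaluate_up(scrape_results):
    """Map each target to 1 if its last scrape succeeded, else 0."""
    return {target: 1 if ok else 0 for target, ok in scrape_results.items()}


def down_alerts(up_values, labels_by_target):
    """Build one alert per target whose up value is 0."""
    alerts = []
    for target, value in sorted(up_values.items()):
        if value == 0:
            alerts.append({
                "alertname": "TargetDown",  # hypothetical alert name
                "instance": target,
                **labels_by_target.get(target, {}),
            })
    return alerts


# Hypothetical outcome of one scrape cycle: the second pod did not answer.
scrape_results = {"10.0.0.1:8080": True, "10.0.0.2:8080": False}
labels = {"10.0.0.2:8080": {"app": "payment", "severity": "critical"}}

up_values = evaluate_up(scrape_results)
alerts = down_alerts(up_values, labels)
print(alerts)
```

The key property is that silence from a target is itself a signal: the server notices the missing answer on the next scrape instead of waiting for the target to report anything.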

[00:18:52]The fourth point is simple debugging.

[00:18:55]We can debug in a simple way. For example, if one of our engineers or developers says that the CPU graph is not showing in Grafana, what happens then? The engineer simply has to run a query: run curl against the target's IP and port number and look at the raw metrics.

[00:19:18]Debugging the issue becomes easy here. You will know what the issue is and what it is not, right?

[00:19:23]You can even look at it directly by visiting the metrics endpoint.

[00:19:25]If Grafana is not showing it, go and check Prometheus. Right? If the endpoint responds, you can see the metrics there. Right? So by running these simple commands you can fully check whether the endpoint is working or not. Meaning, you find out everything easily here.
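
The manual check described here, fetching a target's endpoint and reading the raw metrics, can be reproduced in Python. To keep the sketch self-contained it starts a toy target on localhost first; the `/metrics` path, the payload, and all names are assumptions made for illustration.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

METRICS_BODY = b"cpu_usage_percent 42\nmemory_usage_bytes 1048576\n"

# Bypass any proxy settings from the environment for this local request.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


class MetricsHandler(BaseHTTPRequestHandler):
    """Toy target: serves a fixed metrics payload on /metrics."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(METRICS_BODY)))
        self.end_headers()
        self.wfile.write(METRICS_BODY)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_metrics(url, timeout=2.0):
    """Rough equivalent of `curl <url>`: return the raw metrics text."""
    with _OPENER.open(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def demo():
    """Start the toy target on a free local port, fetch it once, stop it."""
    server = HTTPServer(("127.0.0.1", 0), MetricsHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        return fetch_metrics(f"http://127.0.0.1:{port}/metrics")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

If the fetch returns the expected lines, the target side is healthy and the problem lies further along, for example in the dashboard query; if it fails, the target itself is the place to look.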

[00:19:41]Debugging is done in one go. Otherwise you would first have to pull the logs separately, troubleshoot them, and debug them, and doing that manually takes time. So your work has become easy. Next it says targets expose their metrics on an endpoint. Now see what happens. We know our targets. Who are the targets? They are resources like these, whose metrics we are scraping, pulling.

[00:20:05]So all these are our targets. Now what happens here?

[00:20:09]These targets do nothing more than expose their HTTP endpoint, and Prometheus's data retrieval worker automatically scrapes those metrics.

[00:20:18]There are some OS-level metrics that we cannot scrape directly.

[00:20:22]For those we have exporters. Right? And for applications we have client libraries.

[00:20:29]So there are many ways in which we can scrape metrics automatically with the help of Prometheus. Doing the same thing manually would take a lot of time. Go to each server and check it. Look at each thing one by one. What is the issue?

[00:20:44]What is not? Look at the CPU, look at the memory separately.

[00:20:46]How many engineers will be needed?

[00:20:48]How much money will it cost? So these are the five reasons because of which we use the pull model.

[00:20:53]Now the question arises: why don't we use the push model? Does it cause issues? Three things are already listed in front of us. Right?

[00:21:02]Hard to know if a target is alive, metric cleanup becomes difficult, and scalability problems. Now see what the problems of the push model are. If we use the push model, then every target first has to know the address of the monitoring server, which is our Prometheus: its complete address. On which IP is it running? Right? On which path is it listening? We would need to know the complete address, every detail of it.

[00:21:40]Ok? That is for the push model. Because in the push model the targets themselves send the data, saying: here is this metric, that's the CPU usage. With Prometheus, as I told you, it is the other way round: it first sends a request to the API, and the API gives it the information. The API says: these are the HTTP endpoints and metrics, see for yourself which pods you have to monitor. The opposite happens in push. There the target needs to know the address of the monitoring server. Someone has to maintain the push logic: how will the push happen? That whole process has to be known.

[00:22:09]If an application crashes or there is a network issue, the push fails, and the monitoring server gets confused.

[00:22:16]It will not even know what happened in the application. As far as it can tell, everything is fine and the application is still running. Right? So here you cannot know whether your target is alive or not. So what is the point saying?

[00:22:30]Hard to know if the target is alive. Because if a pod crashes in your cluster and that information has not reached the monitoring server, it will think that the pod is still running. What does our second problem say? Metric cleanup becomes difficult. What does that mean? If one of our pods has been deleted and not cleaned up, its old metrics are still visible in the data on our monitoring server. If one of my pods has been deleted and the information that it is gone has not reached the server, then the server still treats it as a live target.

[00:23:02]According to the server, it is still running. Right? It thinks your pod is up. It doesn't know that the pod is down. That's why we scrape automatically. With the pull model, Prometheus sees the change automatically.

[00:23:12]A deleted pod is quickly marked stale. Right? With push, metric cleanup becomes a little difficult, because in a way we are keeping ghost metrics of something that is dead. Right?
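
The cleanup point can be made concrete with a small sketch: in a pull setup the server can compare the series it holds against the targets that discovery still reports, and flag the rest as stale. The function name, the addresses, and the data are hypothetical, and this is a simplification of how a real server handles staleness.

```python
def reconcile(series_by_target, discovered_targets):
    """Keep series only for targets that discovery still reports.

    Returns (kept_series, stale_targets).
    """
    discovered = set(discovered_targets)
    kept = {t: s for t, s in series_by_target.items() if t in discovered}
    stale = sorted(set(series_by_target) - discovered)
    return kept, stale


# Hypothetical state: three pods were being scraped, one has since been deleted.
series_by_target = {
    "10.0.0.1:8080": [(1700000000, 41.5)],
    "10.0.0.2:8080": [(1700000000, 12.0)],
    "10.0.0.3:8080": [(1700000000, 77.0)],
}
still_discovered = ["10.0.0.1:8080", "10.0.0.3:8080"]

kept, stale = reconcile(series_by_target, still_discovered)
print(sorted(kept), stale)
```

In a push setup there is no such target list on the server side, so nothing tells it that the last value it received from a deleted pod is a ghost.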

[00:23:27]The third point is scalability problems. Suppose there are 10,000 containers running on our cluster.

[00:23:37]Ok? What happens if there are 10,000 containers in the cluster and all of them start pushing metrics to the monitoring server at the same time?

[00:23:45]It will get overloaded. Right? At some point your push model will fail. When the server is overloaded, its CPU will spike. Memory will spike. Your monitoring server could suddenly go down. Anything can happen.
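
A back-of-the-envelope sketch shows why the server controlling the schedule matters. It assumes, purely for illustration, that a pulling server staggers its scrapes by giving each target a hash-based offset inside a 15-second interval, and compares the busiest second against the worst case where all 10,000 targets push in the same second. The helper names and addresses are made up.

```python
import hashlib
from collections import Counter


def scrape_offset(target, interval_seconds):
    """Deterministic per-target offset (in whole seconds) within the interval."""
    digest = hashlib.sha256(target.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % interval_seconds


def peak_requests_per_second(targets, interval_seconds):
    """Largest number of scrapes that land in the same one-second slot."""
    slots = Counter(scrape_offset(t, interval_seconds) for t in targets)
    return max(slots.values())


# 10,000 distinct hypothetical container addresses.
targets = [f"10.0.{i // 250}.{i % 250}:8080" for i in range(10_000)]

pull_peak = peak_requests_per_second(targets, interval_seconds=15)
push_peak = len(targets)  # worst case: every target pushes in the same second
print(pull_peak, push_peak)
```

With the load spread over the interval, the busiest second sees roughly one fifteenth of the targets, whereas an uncoordinated push can hit the server with all of them at once.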

[00:23:57]That is why it is said: what should you do? Prefer the pull model. Prometheus uses the pull model so that it can monitor easily and scrape everything in a centralized, controlled way.

[00:24:12]Ok? So the good thing is that it does your entire scrape every 15 seconds. Right? And that's why, for us, the best tool for monitoring is Prometheus, which gives us the pull model.

[00:24:26]And the pull model is better than the push model. Ok? For large infrastructure, for big companies, the pull model is the best. You will get confused with the push model.

[00:24:37]Complexity will increase. Nothing else.

[00:24:39]Ok? Because engineers will have to do everything manually.

[00:24:41]They have to deliver the metrics to the monitoring server themselves. For that they need to know its address. They must know how to handle the process manually. They have to work out how to push everyone's metrics without sending them all at once. So that is a complex task. Here you just sit back. You have Prometheus. The pull model is in use. Everything is happening automatically. Your tension is gone.

[00:25:00]So that's it for today. We will meet again with a new Prometheus interview question. Alongside this, if you want to know more, you can follow my Prometheus playlist. Here I have picked only the major points so that we can understand the interview questions. The rest is covered in depth in that playlist.

[00:25:17]So thank you guys. Ok, bye.

#DevOps #Prometheus #Kubernetes #DevOpsInterview #PrometheusInterviewQuestions
