Statistical models are mathematical tools used to understand data, predict patterns, and forecast future outcomes. The five-step process involves: (1) defining the problem, (2) collecting quality data, (3) choosing an algorithm, (4) training the model, and (5) evaluating its performance. Linear regression (y = b0 + b1x) analyzes the relationship between two variables, where b0 is the intercept and b1 is the slope. Logistic regression handles binary classification problems. Clustering techniques such as K-means group similar data points based on their characteristics, using Euclidean distance to measure similarity. Model evaluation uses error metrics (actual minus predicted values) and accuracy metrics. Ethical considerations include data privacy, fairness, and avoiding bias in data usage.
Deep Dive
11th Class Computer Chapter 5 | Building Statistical Models (5.3) | Class 11th Comp New Book 2025
Assalam Walekum students. My name is Etesham Khan and I am your computer teacher.
Our topic today is building statistical models. Okay? First, let's look at the definition of a statistical model and what it means for us. We use these models to understand data, to predict patterns, and to forecast future outcomes.
Suppose you look at your data and ask: this thing exists today, so what impact will it have on my results in the future?
The tools that answer that kind of question are statistical models. Okay? Now let us look at the steps.
There are five steps that lead us to a working model.
The first step is to define the problem. This is about problem solving: what exactly do we want to predict? A well-defined problem makes clear what prediction is being made.
Suppose you have to buy groceries for your home and you want to predict how much they will cost. What factors matter? First, you should know your family size. How many members are there in the family? You predict according to that.
Then your location matters a lot.
It matters whether you live in a rural area or an urban area, and the locality matters too. In some areas products are available only in small quantities. In some places the tax is higher, so the amount goes up.
So location matters a lot.
The third and most important factor is your income.
You make your predictions about anything according to your income.
The same kind of product can be bought for ₹50 or for ₹500, so what you buy depends on your income level.
What is your income?
You plan and manage your groceries accordingly.
The second step is to collect the data.
A good prediction depends on good data gathering: the higher the quality of your data, the better your predictions and outcomes will be. Gathering all the relevant information helps you solve the problem. In the grocery example, you should know what your earlier grocery bills were. If your first bill came to Rs. 7,000, 10,000, 15,000 or 20,000, the next one will usually be within a few percent of it: perhaps 2%, 3% or 5% more, or 2% less. So what is your data based on? Your past grocery bills. The number of family members also matters. Perhaps you had fewer family members before and now there are more, or earlier you had guests staying at home, or you were living in a hostel with 10 people and now there are fewer. The prediction depends on that too.
So the better the information, the better the quality of your data.
Food prices matter a lot, income matters a lot, and lifestyle and habits matter a lot as well. Okay?
The third step is to choose an algorithm. For this you have to build a model, which means picking an algorithm. Today we will discuss two types of models: one is linear regression and the other is logistic regression. Okay? The fourth step is to train the model. You take your old data and fit the model to it.
Now your model is trained. Next you take completely new data, apply the model to it, and check whether the model, that is, the algorithm, has actually learned the pattern. As family size increases, do family income and expenses increase as well? If the number of family members has gone up, are the expenses rising, falling, or staying the same? That matters. Likewise, if you increase the number or quantity of grocery products, does the bill grow with it or not? That matters a lot. The fifth step is to evaluate the model. You enter new data into the model you trained and check whether it is working properly or not.
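The train-then-evaluate steps can be sketched in Python. This is only a minimal sketch under stated assumptions, not the lecture's own method: the (family size, grocery bill) records are invented for illustration, and fit_line and mean_abs_error are hypothetical helper names. The model is fitted on the older records and then checked on records it has not seen.

```python
# Hypothetical (family_size, monthly_grocery_bill) records, invented for illustration.
records = [(2, 8000), (3, 11000), (4, 14000), (5, 17000), (6, 20000), (7, 23000)]

def fit_line(points):
    """Least-squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1

def mean_abs_error(b0, b1, points):
    """Average absolute gap between actual and predicted y."""
    return sum(abs(y - (b0 + b1 * x)) for x, y in points) / len(points)

# Step 4: train on the old data. Step 5: evaluate on new, unseen data.
old_data, new_data = records[:4], records[4:]
b0, b1 = fit_line(old_data)
test_error = mean_abs_error(b0, b1, new_data)
print(b0, b1, test_error)
```

A small error on the new records suggests the model has learned the pattern; a large one suggests it has not.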
After that comes linear regression.
It helps us understand the relationship between two variables. Suppose I write an equation with two variables.
I say x = y + 1. If I set y = 1, then x = 1 + 1 = 2, so the value of x becomes two. If I set y = -1, then x = -1 + 1 = 0, so the value of x becomes zero. The value of x changes whenever y changes. So x is the dependent variable and y is the independent variable.
Okay? A similar example is study hours versus marks. If a person has studied very well, then obviously his marks will go up. And if a person has not studied at all, or has studied only an average amount, his marks will reflect that.
Okay? It rarely happens that a person works hard and does not get the reward for it.
Okay? So the number of study hours matters. On what do the marks depend? On the study hours. A student's marks rise with the amount that student studies. Okay? Now we come to a worked example.
We check this through a model. Suppose a person has set up a fruit stall and customers are coming to him.
If 10 customers come, he earns ₹500. If 15 come, he earns ₹700. If 20 come, he earns ₹900. If 25 come, he earns ₹1,100. And if 30 come, he earns ₹1,300. A pattern is forming here: with every increase of five customers, the earning increases by ₹200. Okay? Now we have the formula: y = b0 + b1x. Listen carefully.
Here b0 is the intercept, the value of y when x is zero, and b1 is the slope, the amount y changes for each unit of x. How do we find the slope? Look at the difference between two rows. By how much does the earning change? ₹200. By how many customers? Five.
So b1 = 200 / 5 = 40: each additional customer adds ₹40. We have not found the intercept yet; we will do that now. To understand b0, take the first row: there are 10 customers, and how much is the earning? ₹500.
Which variable is which?
Take the customers as x and the earning as y. If the customers increase, the earning increases automatically, so the earning depends on the customers. Now put the values into the formula: y is 500, b1 is 40, and we multiply it by the 10 customers, so 500 = b0 + 40 × 10. Okay?
What does b0 turn out to be? It comes out to 100. So the intercept is the minimum value, the earning with no customers at all. What is the final equation? Earning = 100 + 40 × Customers. Often the customers we are talking about are random walk-ins, but there will also be some customers who are regulars of that stall.
Suppose he has 20 regular customers, or 22, or 30. If we put 20 customers into the equation, what do we get?
He will reliably get 100 + 40 × 20 = ₹900. And what does the intercept mean? That even if no customer comes, he still has an earning of ₹100. Often a person runs a stall that was set up by someone else. Even if no customer comes, the person sitting at the stall still earns; he has to be paid a fixed amount. Did you understand?
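The fruit stall calculation can be checked with a few lines of Python. This is a small sketch that follows the lecture's hand method (slope from the difference between two rows, intercept from one row) rather than a general regression routine; predict_earning is a hypothetical helper name.

```python
# Fruit stall data from the lecture: customers (x) and earnings in rupees (y).
customers = [10, 15, 20, 25, 30]
earnings = [500, 700, 900, 1100, 1300]

# Slope: change in earnings divided by change in customers between two rows.
b1 = (earnings[1] - earnings[0]) / (customers[1] - customers[0])  # 200 / 5

# Intercept: rearrange y = b0 + b1 * x using the first row (10 customers, 500).
b0 = earnings[0] - b1 * customers[0]

def predict_earning(n_customers):
    """Earning = b0 + b1 * customers."""
    return b0 + b1 * n_customers

# The equation should reproduce every row of the table.
fits_all_rows = all(predict_earning(x) == y for x, y in zip(customers, earnings))
print(b1, b0, fits_all_rows, predict_earning(0))
```

Calling predict_earning(0) returns the intercept, which is the earning the equation gives when no customer comes.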
Okay. The next thing we have is logistic regression. This is a powerful tool that we use when we want to predict an outcome that falls into categories. The book does not give any formula for logistic regression. In linear regression we had to look at the relationship between two variables. Logistic regression is based on binary values: yes or no, true or false, or, based on marks, whether a student will pass or fail.
In other words, it is based on one and zero.
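The lecture gives no formula, but the pass or fail idea can be sketched with the standard logistic (sigmoid) function, which turns any number into a probability between 0 and 1. Everything specific below is an assumption made for illustration: the coefficient values b0 and b1 are made up rather than fitted to real marks, and pass_probability and predict_pass are hypothetical names.

```python
import math

def pass_probability(marks, b0=-10.0, b1=0.25):
    """Sigmoid of b0 + b1 * marks. The default b0 and b1 are made-up values."""
    z = b0 + b1 * marks
    return 1.0 / (1.0 + math.exp(-z))

def predict_pass(marks, threshold=0.5):
    """Return 1 (pass) if the probability reaches the threshold, else 0 (fail)."""
    return 1 if pass_probability(marks) >= threshold else 0

for marks in (30, 40, 80):
    print(marks, round(pass_probability(marks), 3), predict_pass(marks))
```

With these illustrative coefficients the probability crosses 0.5 at 40 marks, so lower marks map to 0 (fail) and higher marks map to 1 (pass).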
So outcomes are categorized as yes and no, pass and fail. As for its types, there are three: binary, multinomial, and ordinal logistic regression. These were the regressions we had to study today, and you should go through them carefully.
We now continue with building statistical models. We have studied linear regression and logistic regression, and the third technique is clustering. It works like this.
Put simply, whatever similar data you have, you group it together and separate it from the rest. For example, suppose the people sitting in front of you are of two genders, male and female. If I ask you to group them, what will you do?
You will apply a clustering technique in which you put the females in one group and the males in another. That is clustering by gender. Okay? By definition, clustering is a way of grouping in which you put similar things together on the basis of their properties and characteristics. As an example, take clustering students: we are given five students, some of them strong in Maths and some in English.
Looking at them, I would say we have to separate the Maths toppers from the English toppers.
The toppers in Maths are Umar and Ali. The toppers in English are Talat, with Annie in second place. So what did we do? We picked two toppers for each subject and grouped them. Now how do we decide where the remaining students, the ones who are not toppers, belong? For that we apply the K-means clustering technique.
Let us see what it does for us. But before that, let me explain a term.
What does the term machine learning mean?
It is basically one of the fields of AI.
What does machine learning do? It helps a system improve its performance by learning from data.
There are two terms we use in machine learning.
One is unsupervised and the other is supervised.
What does unsupervised mean? Suppose I have three, four or five images. I can recognize each one just by looking at it, but this is unlabeled data. Suppose I have a picture of a cat, but I have not written "cat" underneath it. I have a picture of a person, and it is not written that it is a human, a male, or a boy.
I did not write that. And here is a picture of my school, and I did not label that either. I just put up the pictures with no names.
We call this unlabeled data, and learning from it is called unsupervised. If I write the name under each picture, then we call it labeled data, and learning from it is called supervised. Okay? What do we do now? If we want to find the distance between a student and these toppers, to see which topper that student is nearest to, or who comes third or fourth, we apply a method called the Euclidean distance formula. What does it do? It tells you the distance. One more thing: the one whose distance is largest ranks last. We take whichever is closest.
Okay? Now suppose I take centroids. Let me take the toppers: one is Talat and the other is Ali. Ali has 85 in Maths and 70 in English.
Talat has 40 in Maths and 75 in English. For the distance I will use the Euclidean distance formula.
It takes two points, (x1, y1) and (x2, y2). You can also simply call it the distance formula.
d = √((x2 - x1)² + (y2 - y1)²): the two differences are squared and added, and the square root is taken over the whole sum. Okay? Now let me take a student first, say Umar.
What should I do? I will compare Umar with the Maths topper, Ali. Umar's marks are (x2, y2) and Ali's marks are (x1, y1). Now let us apply the formula. For x2 - x1 we have 90 - 85 = 5, and for y2 - y1 we have 65 - 70 = -5. Squaring gives 25 + 25 = 50, and the square root of 50 is about 7.07. In the same way I compare Umar with Talat: (90 - 40)² + (65 - 75)² = 2500 + 100 = 2600, and the square root is about 51. That is a much larger value than the distance to Ali.
So what does that mean?
If I ask which of the two Umar is closest to, he is closer to Ali. So in Ali's group I will write Umar with 90 and 65. Okay? So what did this give us? It did the grouping: each student is placed with the group whose members lie closest to that student's data. You should do likewise: compare Annie first with Ali and then with Talat, then compare Maliha with Ali and Talat, and you will know which student goes to which group. Okay?
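The distance comparison can be sketched in Python. The marks are taken as read from the lecture, as (Maths, English) pairs: Ali (85, 70), Talat (40, 75) and Umar (90, 65). The sketch covers only the assignment step of K-means, placing one student with the nearest centroid; a full K-means run would also recompute the centroids and repeat. The names euclidean and nearest_centroid are hypothetical.

```python
import math

# Centroids (the two toppers) and one student, as (Maths, English) marks.
centroids = {"Ali": (85, 70), "Talat": (40, 75)}
umar = (90, 65)

def euclidean(p, q):
    """Distance formula: sqrt((x2 - x1)^2 + (y2 - y1)^2)."""
    return math.sqrt((q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)

def nearest_centroid(point, centroids):
    """Name of the centroid with the smallest distance to the point."""
    return min(centroids, key=lambda name: euclidean(centroids[name], point))

distances = {name: euclidean(c, umar) for name, c in centroids.items()}
group = nearest_centroid(umar, centroids)
print(distances, group)
```

The same two calls can be repeated for Annie and Maliha once their marks are filled in from the book.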
Okay, so that is the clustering technique. Your book gives only a short paragraph on it, but I have added some detail that should be useful to you.
Next, we talk about evaluating and interpreting models. Let me define the two terms.
To evaluate means to check whether your model's predictions are correct, and to interpret means to explain what the result is. The first subtopic here is performance metrics. There are two types: error metrics and accuracy metrics. What is an error metric? Let me explain. Since the previous topic used a grocery example, the book uses it here as well. Suppose you predicted that your groceries would cost Rs. 8,000, and when you went to buy them the bill came to Rs. 10,000. If I take the actual price minus the prediction, the difference is Rs. 2,000. So Rs. 2,000 is the error of that particular prediction. Okay? Next comes the accuracy metric. Suppose I predict that 85 students out of 100 will pass and 15 will fail, and the actual result is close to that: 83 or 84 pass.
Even if 80 pass, my prediction is reasonably accurate. But if only 15 out of 100 pass, then my model is not correct. Okay? After that, the next thing is interpreting the output. As I told you, this means understanding what the result tells you. What do you do? You draw the conclusion. You should know what the results mean.
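The two metrics can be sketched in Python. The error follows the lecture directly (actual minus predicted, using the grocery figures). For accuracy, the sketch assumes the usual definition, the fraction of individual predictions that match the actual outcome, which is a little more specific than the lecture's pass-count comparison. The pass/fail labels are invented for illustration, and prediction_error and accuracy are hypothetical names.

```python
def prediction_error(actual, predicted):
    """Error metric from the lecture: actual value minus predicted value."""
    return actual - predicted

def accuracy(actual_labels, predicted_labels):
    """Fraction of predictions that match the actual outcome."""
    correct = sum(1 for a, p in zip(actual_labels, predicted_labels) if a == p)
    return correct / len(actual_labels)

# Grocery example: predicted Rs. 8,000, actual bill Rs. 10,000.
grocery_error = prediction_error(10000, 8000)

# Hypothetical pass (1) / fail (0) labels for five students.
actual = [1, 1, 0, 1, 0]
predicted = [1, 1, 0, 0, 0]
acc = accuracy(actual, predicted)
print(grocery_error, acc)
```

An accuracy close to 1 means most predictions were right; a value near 0 means the model needs to be revisited.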
You should know the conclusion and the end result. The second thing is ethical considerations. What are these? You are told to be fair and to avoid bias. What does ethics have to do with it? Ethics is related to privacy: the data we use should be handled ethically. There should be data protection and data security. So whenever a bank or any company gives you a loan and takes data from you, it has the responsibility to be fair with your data and to remain unbiased. It should not use your data anywhere else.
Similarly, we have data privacy.
For example, they can never use your data or share it with anyone without your permission.
Take NADRA or the Board Office: they never share your data without permission unless you have a criminal record. They protect your data, and it is the responsibility of that company or authority to secure it. So we had two topics: one was Clustering Techniques and the other was Evaluating and Interpreting Models. Both are part of building statistical models. Okay? Thank you very much.