
Tutorial 5

Added: 2026-06-19

219 views353:33shizzaasher2965Original Release: 2026-06-19

Simpson's paradox demonstrates how a treatment can appear beneficial within each subgroup but harmful when combined, due to confounding variables that affect both treatment assignment and outcomes. In causal inference, controlling for confounders through multivariate regression or conditioning on covariates is essential to isolate the true treatment effect, as failing to do so can lead to biased estimates that contradict the actual causal relationship.

[00:00:00]With that, let's get started. We will be reverting to version one of this notebook: ATE versus difference in means.

[00:00:07]I just want to reiterate this concept with a simple example.

[00:00:15]All right. So, in order to measure the average treatment effect, we are given the following data frame. You have five ids. You have five individuals. T equals 1 for three of these individuals. This means individuals that were assigned to the treatment group where the treatment was administered and then there are two individuals for which t equals zero.

[00:00:40]These individuals were assigned to the control group. Then you have the outcome given that the treatment was administered and the outcome without the treatment. All right. So these are Yi(1) and Yi(0).

[00:00:56]I will first of all try and find the individual treatment effect. For each individual this is fairly straightforward: I will compute Yi(1) minus Yi(0). In pandas this is really straightforward: you declare a new column name and set it equal to whatever operation you're performing between these two columns. So this is what that looks like: individual causal effect equals df outcome with treatment minus df outcome without treatment.

[00:01:34]After which this is what our data frame looks like. What I will do now is find the average of these individual causal effects. Hence I get the average treatment effect, which in this case is equal to 0.14. It is a positive value, so it suggests that my treatment has been beneficial for these people.
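
A minimal pandas sketch of the two steps just described. The column names are assumptions based on how the columns are read aloud. The observed outcomes (0.8, 0.7, 0.5 for the treated and 0.7, 0.9 for the controls) are quoted later in the session; the remaining counterfactual entries and the order of the rows are hypothetical, chosen only so that the average matches the quoted 0.14.

```python
import pandas as pd

# Five individuals with BOTH potential outcomes (only possible in a toy example).
# Observed values come from the session; the counterfactual values are
# hypothetical placeholders chosen so that the ATE equals the quoted 0.14.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "T": [1, 1, 1, 0, 0],
    "outcome_with_treatment": [0.8, 0.7, 0.5, 0.8, 1.0],
    "outcome_without_treatment": [0.6, 0.5, 0.4, 0.7, 0.9],
})

# Individual causal effect: Yi(1) - Yi(0) for every row.
df["individual_causal_effect"] = (
    df["outcome_with_treatment"] - df["outcome_without_treatment"]
)

# Average treatment effect: the mean of the individual effects.
ate = df["individual_causal_effect"].mean()
print(round(ate, 2))
```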

[00:02:04]Okay. Now the problem is that in the real world you can only have one of these two for an individual. If an individual was given the treatment, you can measure only the outcome with the treatment. The outcome without the treatment is a counterfactual that you cannot measure: it is what would have happened if the individual had not been given the treatment. So in the real world you actually have only 0.8 against this individual, 0.7 here, and 0.5 here. Similarly, for the control group you only have the value 0.7 here and 0.9 here, because if an individual was not given the treatment you can only measure their outcome without the treatment; the other value would be their potential outcome with the treatment. In actuality you cannot measure that, and this is where difference in means comes in. Difference in means is an estimator for the average treatment effect.

[00:03:15]However, we've studied that the difference in means is not always equal to the average treatment effect. But let's look at this scenario. How would we find the difference in means? I will select all the individuals that were given the treatment and find the average of their outcome with the treatment, because this can be measured. Similarly, I will filter out all the individuals that were not given the treatment; I already have the value of their outcome without treatment, so I average that. When I subtract the second average from the first, I get my difference in means.

[00:04:04]So let's try and do that. This is fairly straightforward but really I definitely do want to show you guys how I have split my data frame. So I have a treatment group now.

[00:04:20]That's my treatment group. And then I have a control group. So this is what I did: for the treatment group and for the control group, I keep only the values I can actually measure. So now I can just find the means and then subtract them. This is essentially what we are doing over here.

[00:04:43]So I find the treatment mean: for the treatment group it is the outcome with treatment, dot mean.

[00:04:52]Similarly, I find the mean for the control group, the outcome without treatment, dot mean, and then I calculate the difference in means. And this is quite interesting actually. My average treatment effect was 0.14, which told me that my treatment is beneficial.

[00:05:08]It has a positive impact on the outcome.

[00:05:11]But my difference in means is -0.13, which suggests that if you give the treatment to an individual, it is not beneficial for them; it has an adverse effect. That is not true here, because we have the exact value of the ATE with us as well. So the thing is that the ATE is typically not equal to the difference in means.
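
The difference in means described above can be reproduced from the observed values alone. This is a sketch under assumptions: the column names and the row order are guesses, and the unmeasurable counterfactual entries are stored as `NaN`.

```python
import numpy as np
import pandas as pd

# Only the factual outcomes are filled in; counterfactuals are unobservable.
observed = pd.DataFrame({
    "T": [1, 1, 1, 0, 0],
    "outcome_with_treatment": [0.8, 0.7, 0.5, np.nan, np.nan],
    "outcome_without_treatment": [np.nan, np.nan, np.nan, 0.7, 0.9],
})

# Split the data frame into the two groups with boolean masks.
treatment_group = observed[observed["T"] == 1]
control_group = observed[observed["T"] == 0]

# Mean of the measurable outcome in each group, then subtract.
treatment_mean = treatment_group["outcome_with_treatment"].mean()
control_mean = control_group["outcome_without_treatment"].mean()
difference_in_means = treatment_mean - control_mean

print(round(difference_in_means, 2))  # -0.13
```

The sign is the opposite of the 0.14 average treatment effect quoted earlier, which is the mismatch the session is pointing at.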

[00:05:42]This happens because of confounding variables a lot of which go unobserved.

[00:05:48]The thing is, in a randomized controlled trial you put individuals in the treatment and the control group absolutely at random. So if I were to take a few individuals out of the treatment group and swap them with a few individuals from the control group, ideally, in an RCT, it would have no impact whatsoever. But in the real world, in an observational study, you are already given the data, you don't know how the treatment was assigned, and there could be confounding factors involved. When there are no confounding factors involved, when there was randomness in assignment, then the difference in means is going to be equal (in expectation) to the average treatment effect.

[00:06:44]All right, with that we are going to move on to version two of our notebook and version two includes uh a practice question on conditional probability.

[00:07:00]I believe that yeah inshallah we will include one in ungraded homework 4 as well and I will highly encourage you to attempt these questions before your upcoming midterm exam.

[00:07:15]In fact, even for those of you listening to this later: before we solve our probability problem, just pause and try to attempt it on your own. You will benefit from it. Then, if you're unable to, just go ahead with the solution. All right, let's look at this scenario. Consider a doctor who administers two different treatments for patients with kidney stones, depending on the size of the stones. Treatment A is surgical and treatment B is simple pills. So naturally, your doctor prefers surgeries for people with more severe cases and larger stones,

[00:07:56]and, well, pills otherwise. The following are the success rates of her treatments after a year. So let's see what happens here. You have treatment A and treatment B, for small stones and large stones, and then you have the combined results. The percentages are the percentage of people that recovered in each group, that is, the success rate of the treatment. So essentially I had 87 individuals with small stones who were given treatment A, 84 of whom recovered, which means that there is a 96% success rate for treatment A on individuals with small stones. Similarly you can interpret the rest as well.

[00:08:45]So the first question is: a new patient comes in with kidney stones, and suppose this new patient doesn't know the size of their kidney stones. Should your doctor recommend treatment A or treatment B, just based off of this data, without knowing the stone size? I will encourage you to pause and think.

[00:09:07]So what I observe over here is that, individually, treatment A performs better than treatment B on small stones. On small stones, treatment A's success rate is 96% and treatment B's success rate is 87%.

[00:09:24]Similarly, treatment A outperforms treatment B for large stones as well. For large stones, treatment A has a 73% success rate and treatment B has a 68% success rate. So this is one possible way of approaching my problem. I can say that the doctor should administer treatment A, because treatment A has a better success rate than treatment B for both large stones and small stones.

[00:09:56]Alternatively, when you combine these results for both kinds of stones, you observe now treatment B has a success rate of 82%.

[00:10:07]While treatment A has a success rate of well 79%.

[00:10:14]So you could also say that the doctor should recommend treatment B because overall it performs better. Overall it has a better rate of success. So, how did I compute these percentages? Just to tell you quickly, I had 87. Oh, my bad.

[00:10:33]87 + 263. These were the total number of people who were given treatment A. And then 84 + 192. These were the total number of people for which treatment A was successful. And similarly, I computed this for treatment B as well.
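
The arithmetic for treatment A can be written out as follows. Only the treatment A counts are quoted in the session, so the sketch is limited to them; the dictionary layout is just one convenient assumption.

```python
# Treatment A counts quoted in the session: (recovered, treated) per stone size.
treatment_a = {"small": (84, 87), "large": (192, 263)}

# Success rate inside each stone-size group.
per_group_rate = {size: rec / n for size, (rec, n) in treatment_a.items()}

# Combined rate: pool the counts first, then divide.
total_recovered = sum(rec for rec, _ in treatment_a.values())  # 84 + 192
total_treated = sum(n for _, n in treatment_a.values())        # 87 + 263
combined_rate = total_recovered / total_treated

print(per_group_rate, round(combined_rate, 2))
```

Pooling gives 276 out of 350, which is the roughly 79% combined success rate quoted for treatment A.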

[00:10:51]This is fairly paradoxical though. And I will help you all recall: this is Simpson's paradox. What we observe is that rates or probabilities seen individually across groups flip once we merge the groups together.

[00:11:15]Or vice versa. That's Simpson's paradox. Why does it happen though?

[00:11:26]This we will explore in our very next notebook, where we will go over another example. So let's not go there right now. Do you see any contradictions in the data? You can go through this on your own. Right now, let's make a causal directed acyclic graph for this problem. If you would like, you can pause over here and try to construct this graph yourself as well.

[00:12:04]So I have three things in my graph.

[00:12:06]Treatment, size of stones and recovery rate. This is what my graph comes out to be. Of course, the treatment is going to impact the recovery. But size impacts treatment, and size impacts recovery as well. So let's recall: this particular structure that we observe in a causal DAG is called a confounder. Size is a confounding variable. A confounding variable is something that affects both your treatment and your outcome, directly or through another variable.
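
One way to write this graph down is as an adjacency dictionary. The sketch below is a simplification: the node names and helper functions are hypothetical, and it only looks for direct common causes, not for confounding that runs through another variable.

```python
# Causal DAG for the kidney stone problem: each key points to its children.
dag = {
    "size": ["treatment", "recovery"],
    "treatment": ["recovery"],
    "recovery": [],
}

def parents(graph, node):
    """Return the set of nodes with an edge pointing into `node`."""
    return {src for src, targets in graph.items() if node in targets}

def common_causes(graph, treatment, outcome):
    """Nodes that directly cause both the treatment and the outcome."""
    return (parents(graph, treatment) & parents(graph, outcome)) - {treatment}

confounders = common_causes(dag, "treatment", "recovery")
print(confounders)  # {'size'}
```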

[00:12:44]So with a confounder, the problem essentially is that when I'm trying to measure the effect of treatment on recovery, I am not able to isolate that effect just by looking at my data, because when I observe recovery I cannot be certain whether it is due to the treatment or due to the size of the stones. To put it even more simply: maybe people with smaller stones are more likely to recover and people with larger stones are less likely to recover. One treatment could definitely be more effective than the other, but maybe I'm just not administering the more effective treatment to a lot of people, simply because their stones are small.

[00:13:36]So anyway, what are we going to do over here? we have to control for size. And how do you control for size?

[00:13:47]So in order to find the treatment effect over here, you will have to control for your confounder. This is how I will do it. I will find the probability of recovery given that my treatment is equal to A, and then I will find the probability of recovery given that my treatment equals B, while I am controlling for size. So this will be equal to the probability of recovery given that the treatment is A and the size is small, multiplied by the probability that the size is small, plus the same thing for large size: the probability of recovery given that the treatment is A and the size is large, multiplied by the probability that the size is large.

[00:14:47]So this is how I will find the probability of recovery given that the treatment is A while I am controlling for size.
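
Written as a formula, the adjustment just described looks like this. The symbols are shorthand introduced here, not notation from the notebook: R is recovery, T is the treatment and S is the stone size.

```latex
P(R = 1 \mid T = A,\ \text{controlling for } S)
  = P(R = 1 \mid T = A, S = \text{small})\, P(S = \text{small})
  + P(R = 1 \mid T = A, S = \text{large})\, P(S = \text{large})
```

The version for treatment B is identical with T = B in both conditional terms; the weights P(S = small) and P(S = large) do not change.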

[00:15:05]I can do the exact same thing for B as well. What is important to understand over here is that in this example you just have one confounding variable, so this is how you went about it. What if you had two confounding variables, or multiple confounding variables? Recall adjusting for covariates from your causal inference lecture and your lecture slides.

[00:15:40]So suppose I also had age, and age had two possible values, young and old. Then I would also have to multiply by the probability that the age is young, and I would have to make all combinations essentially: size is small, age is young; size is large, age is young; size is small, age is old; and size is large, age is old. So two variables with two values each give four combinations. Similarly, if I have three confounding variables, then I have to control for all three of them over here.
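
With two confounders the sum runs over every combination of their values. A hedged way to write it, using G for age group as shorthand introduced here:

```latex
P(R = 1 \mid T = A,\ \text{controlling for } S, G)
  = \sum_{s \in \{\text{small},\, \text{large}\}}
    \sum_{g \in \{\text{young},\, \text{old}\}}
    P(R = 1 \mid T = A, S = s, G = g)\; P(S = s, G = g)
```

The weight is the joint probability of the combination. It reduces to the product P(S = s) P(G = g), the "multiply by the probability that the age is young" step, only when size and age are independent of each other.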

[00:16:25]All right. Anyway, I am going to leave the solution over here with you guys, because now you just have to plug in values. So take the case where the treatment is A and the size is small. Given that the treatment is A and the size is small, the probability of recovery is 96%, 0.96. That is multiplied by the probability that the size is small. So what is the probability that the size is small? I have 350 + 350 stones, 700 stones in total. 87 + 270 of these, 357 stones, were small; the rest were large. So I have 357 divided by 700.

[00:17:21]Similarly you can plug in the rest of the values as well. For your information, this is what was given to you in the slides as well: adjusting for covariates. This is how you adjust for covariates, and this is the solution for the problem. It's the exact same math; you just have to plug in every single entry over here. It comes out that the probability of recovery given treatment A, controlling for size, is 85%, and the probability of recovery given treatment B, controlling for size, is 78%.
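
Plugging in the numbers quoted in the session gives the two adjusted probabilities. This is a sketch that uses the rounded per-group success rates as read aloud (0.96, 0.73, 0.87, 0.68); the variable and function names are assumptions.

```python
# P(size = small) from the counts quoted in the session.
p_small = (87 + 270) / (350 + 350)  # 357 / 700
p_large = 1 - p_small

# Per-group success rates as quoted (rounded percentages).
success = {
    "A": {"small": 0.96, "large": 0.73},
    "B": {"small": 0.87, "large": 0.68},
}

def adjusted_success(rates):
    """Weight each group's success rate by how common that stone size is."""
    return rates["small"] * p_small + rates["large"] * p_large

adjusted = {treatment: adjusted_success(rates) for treatment, rates in success.items()}
print({t: round(p, 2) for t, p in adjusted.items()})  # {'A': 0.85, 'B': 0.78}
```

After weighting both treatments by the same stone-size distribution, treatment A comes out ahead, in line with the per-group comparison.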

[00:18:10]So now I can confidently say that treatment A overall performs better.

[00:18:21]With that, let's move on to our Simpson's paradox notebook.

[00:18:33]So, we're going to go over a case study and we're going to understand this better. We're going to ground it with a good example. I don't want you guys to focus on the data generating algorithm right now, but we are using a data generating algorithm in this example. I will definitely circle back to it though.

[00:18:58]So let's go over what we are doing over here. Let's give you some context. So suppose this is interesting. The world is confronted with a rare disease for which the world's public health agencies are urgently trying to find therapies or solutions. Scientists have proposed a new drug, but its effectiveness has not fully been demonstrated yet. So, they've introduced a new drug, but they're not sure how effective it is right now. As your government's top data scientist, you have been tasked with finding evidence of the drug's effectiveness.

[00:19:38]This is not a light task, and your feedback will inform the country's treatment policy. However, there is not enough time to run a clinical study.

[00:19:52]Instead, you have observational data available. You have access to data that was collected in a neighboring country, which has already started to use the drug on its patients, and the data set contains the following variables.

[00:20:12]So what do we have over here? Observational data from another country, and these are the variables that are given to us. Z is the severity of symptoms. Z can have only one of three values: mild, which is denoted by 0; strong, which is denoted by 1; or critical, which is denoted by 2. Then I have another variable, A, which just tells you whether the treatment was administered or not. If the value is 1, the person was given the treatment; if the value is 0, the person was not given the treatment and was in your control group. Then I have Y. Y is the survival after one month, and Y has two values: either the person is dead after a month or the person is still alive. Okay, with that let's get started. I will show you what your data looks like as well. So I have 10,000 rows and three columns: Z, A and Y. Once again, Z is the severity of symptoms, with values 0, 1 and 2, where 2 is critical.

[00:21:28]Then A is whether the treatment was administered or not. And Y just tells you whether the person was alive after 1 month or not.

[00:21:36]So let's see. This is what you are interested in measuring: the average treatment effect, which means the chances of survival if someone takes the treatment versus the chances of survival if someone does not take the treatment.

[00:21:51]So like I mentioned, we cannot figure out whether a person would have been dead or alive without the treatment, given that they were in fact given the treatment. Let me elaborate on that. Take the individual in row zero. This individual had critical symptoms, so Z is 2, and A is 1, so this person was given the treatment. The outcome is that the person died. That is what happened when the person was given the treatment.

[00:22:31]What could have happened if the person was not given the treatment? That would be my counterfactual. What would have happened if the treatment was not administered? So I cannot directly compute the ATE. What I can do is find the difference in means, which we just looked at as well.

[00:22:52]So this is the difference in means. You have the expected value of the recovery, Y, given that the treatment was administered, minus the expected value of Y given that the treatment was not administered, and these are the averages that I have been able to compute.

[00:23:24]So these are what my averages come out to be. When I take a difference of these averages, this is what I get.

[00:23:34]I get -0.11.

[00:23:39]To clarify, that's not exactly your ATE; that's an estimate of your ATE. Often students confuse these things. You're finding the difference in means, which is giving you an estimate for your ATE.

[00:23:56]Anyway, when I take the mean for one group and subtract the mean for the other group, this is what I get: -0.11.

[00:24:09]So what do you conclude from this? I want you to quickly pause and think about this here as well.

[00:24:19]So essentially this tells us that the treatment has a detrimental effect on the outcome of the individual. If you administer the treatment, you are increasing the likelihood that the patient will die within a month. So this is fairly odd. This not only tells you that the treatment is ineffective; in fact it tells you that the treatment will make the outcome worse, which may or may not be true.

[00:24:47]So anyway, you start to question these conclusions and you decide to analyze whether the treatment effect varies depending upon the severity of symptoms. So what am I doing? I am essentially conditioning on my symptoms. Well, I'm not exactly conditioning on them yet; I will do that along the way. I will find the difference in means within each individual group. I want to understand, for patients with mild symptoms, what the rate of recovery was and how effective this drug was, and I also want to see how effective this drug was for patients with strong and critical symptoms.

[00:25:41]So this is what I find. Interestingly, all of these values are positive.

[00:25:49]However, my combined value was negative.

[00:25:53]So let's see. This is what we see: the conditional average treatment effect by disease severity. And I see positive values for all three symptom groups. This is essentially exactly what Simpson's paradox tells you about.

[00:26:13]I observe an effect in my data across different classes that completely reverses as soon as I merge all those three classes.

[00:26:26]That is your Simpson's paradox with a slightly more technical example. So what do we conclude? While the treatment appears detrimental overall, it has a positive effect on the survival rate for each individual group. Those with mild, strong and critical symptoms all do better with the treatment. Now we're puzzled and we want to dig deeper. What I quickly want to tell you is that if we construct the causal directed acyclic graph, we will realize that the severity of symptoms is actually a confounder.

[00:27:06]But let's hold on to that thought. So now I make another plot for the probability of receiving the treatment with respect to symptoms. And what do I observe over here? A fairly positive trend. People with more severe symptoms are more likely to receive the treatment.

[00:27:29]So people with critical symptoms are much more likely to receive treatment than people with strong symptoms, who are more likely to receive treatment than people with mild symptoms. So what do we observe over here? The probability of being treated clearly increases with the strength of the symptoms. That's probably because of some policy that the government had introduced: when the symptoms are the strongest, they will want to treat the patient for it. So let's look into the relationship between the strength of these symptoms and the probability of survival. Now I just want to quickly show you these previous graphs so that you do not mix them up.

[00:28:19]First, the conditional average treatment effect by disease severity. Essentially we found the difference in means, but we found it for each individual symptom group.

[00:28:32]After that we made a plot which showed us the probability of receiving the drug with regard to symptoms, for each individual symptom class. Now we are looking at the probability of survival with respect to symptoms. So I can also see that as the symptoms become more severe, the probability of survival decreases.

[00:28:59]So between my three variables, what have I observed? Well, this is clear: the probability of survival clearly decreases with the intensity of symptoms. I also know that the intensity of symptoms determines whether people will receive the treatment or not. I can see that from this graph right here. If the symptoms are more severe, then the people are more likely to receive treatment. So the severity of symptoms, which is Z, is impacting whether the treatment is administered or not, which is A. Similarly, Z is also impacting Y, which is survival.

[00:29:46]However, I am just trying to observe the effect of A on Y. I am not able to isolate this effect, because this is what my causal directed acyclic graph comes out to be: Z is a confounder. In order to isolate the effect of A on Y, I have to be able to control for Z.

[00:30:12]So this is what the causal directed acyclic graph comes out to be. There are two paths between A and Y. The path through Z is transmitting spurious correlations from A to Y. The direct path from A to Y is the only path that will reveal the true total causal effect of A on Y. So what essentially do we do in this scenario?

[00:30:44]Hm, I could control for Z. And in this scenario, I want you all to recall, controlling for Z is the exact same as intervening and fixing the value of A.

[00:31:02]Right now I have a certain data generating algorithm. That data generating algorithm, which I have not walked you through yet, ensures Z is causing A in some way and Z is also causing Y in some way. So essentially, if I'm able to get rid of that and just replace it with a coin flip, then and only then will I be able to determine the true impact of A on Y. So I just want to show you a bird's eye view of our previous data generating algorithm at this point.

[00:31:42]This is how you are generating data. Now let's look at this. A random integer is generated here for Z between 0 and 3, so 0, 1 or 2. And this is the number of samples that I'm generating.

[00:31:58]So clearly nothing else is impacting Z; Z is completely random. And then I have the policy. In the policy I am specifying the probability of treatment for each Z. So I want you to really focus on this: a 5% chance of receiving the treatment if Z is zero. So a 5% chance of receiving the treatment if your symptoms are mild.

[00:32:24]50% chance of receiving the treatment if your symptoms are severe.

[00:32:32]90% chance of receiving the treatment if your symptoms are critical.

[00:32:39]And that is how I am generating the A samples. I generate a random probability, a function returns a value based on Z, and I check it for each z in Z.

[00:32:55]Anyway, once you have obtained the A values as well, now I want to construct Y.

[00:33:03]Now look at this structure: a 75% chance of survival for mild symptoms that were untreated, a 25% chance of survival for severe symptoms that were untreated, and a 10% chance of survival for critical symptoms that were untreated. And similarly there are probabilities for the treated, and using these I generated my data.
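
A minimal sketch of a data generating process along these lines, not the notebook's own code. The treatment policy (5%, 50%, 90%) and the untreated survival probabilities (75%, 25%, 10%) are the ones quoted in the session. The treated survival probabilities are not stated, so the values below are hypothetical placeholders, and the numbers this sketch prints will not match the notebook's.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 10_000

# Z: severity of symptoms, drawn uniformly from {0, 1, 2}; nothing causes Z.
z = rng.integers(0, 3, size=n)

# Policy from the session: P(A = 1 | Z) for mild, strong/severe, critical.
policy = np.array([0.05, 0.50, 0.90])
a = (rng.random(n) < policy[z]).astype(int)

# P(Y = 1 | A, Z): row index is A, column index is Z.
survival = np.array([
    [0.75, 0.25, 0.10],  # untreated, as quoted in the session
    [0.95, 0.75, 0.20],  # treated: HYPOTHETICAL values, not from the session
])
y = (rng.random(n) < survival[a, z]).astype(int)

df = pd.DataFrame({"Z": z, "A": a, "Y": y})

# Naive difference in means: E[Y | A = 1] - E[Y | A = 0].
means = df.groupby("A")["Y"].mean()
naive_diff = means.loc[1] - means.loc[0]

# Same difference computed separately inside each severity group.
by_group = df.groupby(["Z", "A"])["Y"].mean().unstack("A")
cate = by_group[1] - by_group[0]

print(naive_diff)
print(cate)
```

Under these assumed probabilities the pooled difference is negative while each per-severity difference is positive, which is the same reversal discussed above.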

[00:33:35]Hence my data was systematically generated, and this is what my data generating algorithm translates to in my graph. This is what I want to do. So essentially now I want to intervene and cut the association between Z and A. If I fix the value of A, I should be able to do that. So what am I going to do? Ah, I have accidentally revealed the solution altogether.

[00:34:06]Anyway, I want to replace this with the coin flip. I want assignment to the treatment group or the control group to not be determined by symptoms; rather, there is an equal chance for each of the three categories. So this is what I did. The policy was 5%, 50%, 90%. By the way, they do not sum up to 100% because it's not a distribution. All I'm saying is there is a 5% chance that a person is given the treatment if their symptoms were mild. There's a 95% chance that they were not given the treatment if their symptoms were mild, and so on and so forth.

[00:34:52]I replace these with 50/50. So a 50% chance of getting the treatment for each kind of symptoms. And so I generate my data again.

[00:35:05]Now look at my updated graph.

[00:35:07]Probability of receiving the drug with respect to symptoms 50% each.

[00:35:16]And now this is what I will do. I will compute the expected value of Y, recovery, given do(A = 1): I intervene and fix the value of A to 1. What would have happened if I had administered this treatment? Minus the expected value of recovery, Y, given do(A = 0): I intervene and give no treatment.
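
In symbols, the interventional quantity just described sits next to the purely observational difference in means like this ("DM" is shorthand introduced here):

```latex
\text{ATE} = \mathbb{E}[Y \mid \mathrm{do}(A = 1)] - \mathbb{E}[Y \mid \mathrm{do}(A = 0)]
\qquad \text{versus} \qquad
\text{DM} = \mathbb{E}[Y \mid A = 1] - \mathbb{E}[Y \mid A = 0]
```

The two differ only in whether A is set by intervention or merely observed.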

[00:35:44]So let's see, this is what I observe. Now this is interesting. I will emphasize again that we are finding the difference in means right now. If you look at our piece of code as well, the filtering is, by the way, the same thing as writing this down, which uses the syntax you're more familiar with. So anyway, you filter on treatment true and treatment false, and then you find your mean values. So now the difference in means comes out to be 0.276.
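
The effect of swapping the policy for a coin flip can be sketched by parameterising the generator on the policy. As before, this is an illustration under assumptions rather than the notebook's code: the function names are hypothetical and the treated survival probabilities are placeholder values, so the output will not reproduce the 0.276 quoted above.

```python
import numpy as np
import pandas as pd

SURVIVAL = np.array([
    [0.75, 0.25, 0.10],  # untreated, as quoted in the session
    [0.95, 0.75, 0.20],  # treated: HYPOTHETICAL values, not from the session
])

def generate(policy, n=10_000, seed=0):
    """Simulate (Z, A, Y) where `policy[z]` is P(A = 1 | Z = z)."""
    rng = np.random.default_rng(seed)
    z = rng.integers(0, 3, size=n)
    a = (rng.random(n) < np.asarray(policy)[z]).astype(int)
    y = (rng.random(n) < SURVIVAL[a, z]).astype(int)
    return pd.DataFrame({"Z": z, "A": a, "Y": y})

def difference_in_means(df):
    """Mean outcome of the treated rows minus mean outcome of the control rows."""
    treated = df[df["A"] == 1]["Y"].mean()
    control = df[df["A"] == 0]["Y"].mean()
    return treated - control

observational = generate([0.05, 0.50, 0.90])  # treatment depends on severity
randomized = generate([0.5, 0.5, 0.5])        # coin flip for every severity

dm_obs = difference_in_means(observational)
dm_rct = difference_in_means(randomized)
print(dm_obs, dm_rct)
```

With severity-dependent assignment the difference in means is negative; with the coin flip it turns positive, because Z no longer influences who gets treated.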

[00:36:29]Now I can see that the drug actually has a positive effect on survival. So people have a better chance of surviving if they are given the drug. To compare: when I calculated the treatment effect using just conditional probabilities, the answer was -0.11, which meant the treatment had an adverse effect on individuals. That was not actually true; I was observing it because I had a confounder, the severity of symptoms. Once I cut the link between severity of symptoms and treatment assignment and made the treatment assignment random, I observe that my average treatment effect is positive, and that means that the treatment is in fact effective. Hence I want you to understand that with conditional probability your result can be biased. I want you to pause over here for a second and think about when these two values are going to be equal. It has been tested before and it is a very important concept. When are these two going to be equal? When is conditioning going to give you the exact same result as randomization, as intervening?

[00:37:55]So this happens when there are either no confounding variables or all confounding variables have already been controlled for. Just remember the difference between conditioning and acting, or intervening, as well. With conditioning you are just observing the data. An example is observing that people who drink coffee tend to have higher rates of heart disease. This observation alone doesn't prove that coffee is what's causing heart disease, because there could be other factors involved.

[00:38:28]Let me break it down a little further.

[00:38:31]This doesn't tell you that people who drink coffee tend to have heart disease.

[00:38:37]This tells you that among people who have heart disease, a lot of them are coffee drinkers. So this is just what happens to occur in your data. It just tells you what's already in your data. By acting, intervening, you can establish causal links, because this is like an RCT. Some people are assigned to drink coffee, some people are assigned to not drink coffee.

[00:39:07]Um, and then if you compare their heart disease rates, so then you can establish a causal link.

[00:39:17]Anyway, a few final words. When conducting data analysis, it is important to think about the nature of the data and how it could have been generated, the way that we constructed our causal DAG and immediately determined the confounder. If you fail to do so, your results will be biased. Here, you would have concluded that your treatment was ineffective,

[00:39:41]while you actually have an effective solution to the rare disease. With that, we're going to move on to the last notebook that we will be going over in today's session, and this covers multivariate regression. A lot of you may not remember this; it is from lecture 12.

[00:40:00]When we were looking at the average treatment effect, we saw that difference in means is one way of estimating the average treatment effect. Similarly, you have multivariate regression. For those of you who have already taken ML, you're already familiar with regression as well. For those of you who haven't, do not worry about it, because regression is covered in module 4 of the course, which we will be starting soon, inshallah, and then I'll hold a tutorial to go over that in code as well. Anyway, what do you do in regression analysis? You build a model of Y, where Y is your outcome, as a function of all the covariates X and the treatment T. So what exactly are covariates? Covariates are variables, not always confounders by the way; they are just other variables which are involved in your data generating algorithm, in your causal graph. But these are variables that we are not interested in. We are interested in the treatment and the impact that the treatment has on the outcome. So we fit a model for Y, we add all the covariates to that model, and we also add the treatment to that model. For instance, with cholesterol, you were trying to determine whether exercise is what is impacting cholesterol or not. And then you had a covariate, age, which was affecting cholesterol. Maybe it was affecting exercise as well.

[00:41:36]So you formed a regression equation for this. If you have other variables, you would just add them to this equation.
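
A hedged sketch of what such a regression could look like for the cholesterol example. The equation itself is not read out in the session, so the linear form assumed here is Y = b0 + b_T * T + b_X * X + noise, with T = exercise and X = age. All data below is synthetic and every number is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000

# HYPOTHETICAL synthetic data: age affects both exercise and cholesterol.
age = rng.uniform(20, 70, size=n)
p_exercise = 1 - (age - 20) / 100          # falls from 1.0 at age 20 to 0.5 at age 70
exercise = (rng.random(n) < p_exercise).astype(float)
true_effect = -10.0                         # assumed effect of exercise on cholesterol
cholesterol = 150 + 1.0 * age + true_effect * exercise + rng.normal(0, 5, size=n)

# Design matrix with an intercept, the treatment, and the covariate.
X = np.column_stack([np.ones(n), exercise, age])
beta, *_ = np.linalg.lstsq(X, cholesterol, rcond=None)
adjusted_effect = beta[1]                   # coefficient on the treatment

# Difference in means, for comparison; it ignores age entirely.
naive_effect = cholesterol[exercise == 1].mean() - cholesterol[exercise == 0].mean()

print(adjusted_effect, naive_effect)
```

Because age is included in the model, the coefficient on exercise lands close to the assumed effect, while the unadjusted difference in means also absorbs the age gap between the two groups.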

[00:41:45]Then you fit your model to your data.

[00:41:52]It will essentially help you incorporate all of your other covariates, and it will help you estimate the treatment effect. Let's look at this through an example. Unless you see an example, it's hard to be clear about this. So this is a case study borrowed from Dr. Hassan's offering. What are we looking at? A paradoxical collider effect. I will help you all recall what a collider effect is. The problem at hand is: what is the causal effect of sodium intake on blood pressure? You are looking at good controls and you're looking at bad controls. We don't know which is which right now, but you are trying to determine the causal effect of sodium intake on blood pressure. There is a research paper; for those of you interested, you're welcome to explore it on your own.

[00:42:55]Let's go over the problem statement. In Southeast Asia, we often consume food with high salt content, and food with high salt content is going to impact blood pressure. The National Health Survey of Pakistan estimated that hypertension, which is high blood pressure, affects 18% of Pakistani adults. It is well known that high blood pressure is associated with an increased risk of mortality, that is, an increased risk of death. So we are interested in finding the causal effect of sodium intake on blood pressure.

[00:43:34]Our data is coming from this paper. The outcome variable Y is the systolic blood pressure.

[00:43:41]This is a continuous variable. Then the treatment is sodium intake. Sodium intake is a binary variable.

[00:43:48]The data stores one if the sodium intake was more than 3.5 mg and zero otherwise. So you have either high sodium intake or low sodium intake, either one or zero.

[00:44:00]The measured covariates are denoted by X. There are two covariates over here.

[00:44:07]We're not commenting on the nature of these covariates. We just know that there are two covariates right now. So you have the age and the amount of protein in the urine of an individual. We have data on 1 million individuals. In this example, suppose you already know that the true average treatment effect is equal to 1.05.

[00:44:34]Let's see.

[00:44:36]So let's look at our data set. This is what it looks like: 1 million rows and four variables recorded, which are blood pressure, sodium, age, and protein in urine.
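For anyone who wants to follow along without the original file, a data set with the same four columns can be simulated. The sketch below is only a stand-in under stated assumptions: the function name `make_sodium_data`, the column names, the smaller sample size, and every coefficient except the true effect of 1.05 are hypothetical choices, picked so that age affects both sodium and blood pressure while protein in urine depends on both sodium and blood pressure.

```python
import numpy as np
import pandas as pd


def make_sodium_data(n=100_000, true_ate=1.05, seed=0):
    """Simulate a hypothetical sodium / blood pressure data set.

    All coefficients other than true_ate are assumptions for illustration.
    """
    rng = np.random.default_rng(seed)
    age = rng.normal(65, 5, n)
    # Binary treatment: 1 = high sodium intake, more likely at older ages.
    sodium = (age / 18 + rng.normal(size=n) > 3.5).astype(int)
    # Outcome: driven by the treatment and by age.
    blood_pressure = true_ate * sodium + 2.0 * age + rng.normal(size=n)
    # Protein in urine: driven by both the treatment and the outcome.
    proteinuria = 0.4 * sodium + 0.3 * blood_pressure + rng.normal(size=n)
    return pd.DataFrame(
        {
            "blood_pressure": blood_pressure,
            "sodium": sodium,
            "age": age,
            "proteinuria": proteinuria,
        }
    )


df = make_sodium_data()
print(df.shape)
print(df.head())
```

Because the effect of sodium is written into the simulation, any estimate computed on this data can be compared against the known value of 1.05.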

[00:44:49]I will give you guys a few seconds to look at this data set and then we'll move on.

[00:45:29]All right, moving on with our data set.

[00:45:33]So what do I do now? I'm going to try and find the difference in means, which as we know is an estimate of the average treatment effect. In this example I'm already telling you the average treatment effect: it is equal to 1.05.

[00:45:53]Now I will find the difference in means, and we already know how we're going to do that. I will create two new data frames, two groups: one where sodium equals one, the other where sodium equals zero. For both groups I will compute the mean blood pressure and then take their difference.

[00:46:12]So this is what it comes out to be: roughly 133 for the group with high salt consumption and roughly 127 for the control group. When I take the difference, my ATE comes out to be 5.32, but this is a large overestimate of my actual ATE. Ultimately, it means there's some confounder involved, but we're going to look at that later.
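The difference-in-means step can be sketched in pandas as follows. This is a minimal illustration on simulated data, not the notebook's code: the data-generating lines and column names are assumptions, with the true effect of sodium fixed at 1.05 so that the overestimate is visible.

```python
import numpy as np
import pandas as pd

# Hypothetical simulated data (assumed coefficients; true effect = 1.05).
rng = np.random.default_rng(0)
n = 100_000
age = rng.normal(65, 5, n)
sodium = (age / 18 + rng.normal(size=n) > 3.5).astype(int)
blood_pressure = 1.05 * sodium + 2.0 * age + rng.normal(size=n)
df = pd.DataFrame({"sodium": sodium, "age": age, "blood_pressure": blood_pressure})

# Split into the two groups by treatment status.
treated = df[df["sodium"] == 1]
control = df[df["sodium"] == 0]

# Difference in mean outcome between the groups.
diff_in_means = treated["blood_pressure"].mean() - control["blood_pressure"].mean()
print("difference in means:", round(diff_in_means, 2))

# The groups also differ in age, which hints at confounding.
print("mean age, treated:", round(treated["age"].mean(), 2))
print("mean age, control:", round(control["age"].mean(), 2))
```

In this simulation the treated group is older on average, so part of the gap in blood pressure comes from age rather than from sodium.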

[00:46:40]Anyway, moving on towards your multivariate regression design. You're going to try and estimate the ATE by controlling for different variables, and you control for whichever variables you add as covariates in your regression model. So $Y = \alpha_0 + \alpha_1 \cdot \text{sodium} + \epsilon$, where $\epsilon$ is a small value that goes into your regression model, $\alpha_0$ is your intercept, and $\alpha_1$ is your slope for sodium. So right now I'm actually only considering two things, the treatment and the outcome: sodium and blood pressure.
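Written out, the regression specifications compared in this case study might look like the following. The first line is the equation just described; the coefficient names $\alpha_2$ and $\alpha_3$ for age and protein in urine are assumed notation, and the fitted values of each coefficient will generally differ from one specification to the next.

```latex
\begin{align}
Y &= \alpha_0 + \alpha_1 \cdot \text{sodium} + \epsilon
  && \text{(no controls)} \\
Y &= \alpha_0 + \alpha_1 \cdot \text{sodium} + \alpha_2 \cdot \text{age} + \epsilon
  && \text{(controlling for age)} \\
Y &= \alpha_0 + \alpha_1 \cdot \text{sodium} + \alpha_2 \cdot \text{age}
     + \alpha_3 \cdot \text{protein} + \epsilon
  && \text{(controlling for age and protein)}
\end{align}
```

In each linear specification, the slope $\alpha_1$ on sodium is the quantity read as the estimated treatment effect.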

[00:47:24]I'm not considering age and I'm not considering protein. From the data frame I filter out sodium as X, since no controls were added. So X is whatever is going on this side, and then Y is the outcome; Y only stores the blood pressure. Then I instantiate a linear regression model. I fit the model using X and Y. I copy my data frame, and now, against my data frame, I use sodium to predict the outcome with the model I have fit. After that I create two copies of my data frame. In one copy I set all the sodium values to one.

[00:48:26]In the other copy I set all of the sodium values to zero. And now I try and determine the outcome with the sodium value set to one for every single individual. I'm doing this for every single individual even though in the actual data, and that was the limitation with the difference in means approach as well, we do not have the individuals' counterfactuals. If an individual was consuming sodium, I had no way of determining what would have happened if that same person had not consumed sodium.

[00:49:12]This is where the model is helping you. I have a model that, given a patient's sodium value, the treatment value, predicts the outcome. So now I have all the outcomes if the treatment was one. I will do the same thing by setting the treatment to zero as well, and then the model will predict and tell me, with the value set to zero, all of the patients' outcomes. Then I take their difference. So it helps me find an estimate of my average treatment effect without, right now, controlling for anything. This comes out to be 5.32, which is essentially fairly similar to the difference in means. But to be fair, the idea is to be able to include covariates. So suppose that I include age with sodium as well. Everything else remains the same except I'm including sodium and age, both of them. Then I do the exact same thing and my ATE estimate comes out to be 1.045, which is fairly close to the actual ATE. I will actually comment on this later. Now I will try and include my other covariate as well: I have sodium, which is the treatment, I have age, which is a covariate, and I have protein in urine, which is also a covariate. So just this thing changes; everything else remains the same. And the ATE with adjustment comes out to be a big underestimate of what I was actually trying to find: now it's 0.848. This was still closer to my actual answer.
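The fit, copy, predict, and difference procedure described above can be sketched with scikit-learn. This is a hedged illustration rather than the notebook's code: the helper names `make_data` and `ate_by_regression`, the column names, and the simulation coefficients (other than the true effect of 1.05) are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression


def make_data(n=100_000, seed=0):
    """Hypothetical simulation; the true effect of sodium is 1.05."""
    rng = np.random.default_rng(seed)
    age = rng.normal(65, 5, n)
    sodium = (age / 18 + rng.normal(size=n) > 3.5).astype(int)
    blood_pressure = 1.05 * sodium + 2.0 * age + rng.normal(size=n)
    proteinuria = 0.4 * sodium + 0.3 * blood_pressure + rng.normal(size=n)
    return pd.DataFrame(
        {"blood_pressure": blood_pressure, "sodium": sodium,
         "age": age, "proteinuria": proteinuria}
    )


def ate_by_regression(df, treatment, outcome, covariates):
    """Fit outcome ~ treatment + covariates, then compare predictions
    with the treatment set to 1 versus 0 for every individual."""
    features = [treatment] + list(covariates)
    model = LinearRegression().fit(df[features], df[outcome])

    # Copy of the features in which everyone is treated.
    df_treated = df[features].copy()
    df_treated[treatment] = 1
    # Copy of the features in which no one is treated.
    df_control = df[features].copy()
    df_control[treatment] = 0

    # Average difference between the two sets of predicted outcomes.
    return float((model.predict(df_treated) - model.predict(df_control)).mean())


df = make_data()
estimates = {
    "no controls": ate_by_regression(df, "sodium", "blood_pressure", []),
    "age": ate_by_regression(df, "sodium", "blood_pressure", ["age"]),
    "age + proteinuria": ate_by_regression(
        df, "sodium", "blood_pressure", ["age", "proteinuria"]
    ),
}
for name, value in estimates.items():
    print(f"{name}: {value:.3f}")
```

Because the model is linear with no interaction terms, the averaged difference in predictions equals the fitted coefficient on sodium; the two-copies approach is the more general recipe and carries over to models where that shortcut does not hold.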

[00:51:15]Without any controls, it was a big overestimate of my actual answer. So why are we observing this?

[00:51:21]Now I will comment on the nature of these covariates. Where is this? Yeah, here we go. I have sodium intake, which is impacting blood pressure. I have age.

[00:51:35]Age is a confounder. Age is impacting my sodium intake, perhaps, and then age is impacting the blood pressure. With age, people are more likely to have high blood pressure. This is just based off of our data. I'm not telling you older people are likely to consume more sodium or younger people are likely to consume more sodium.

[00:51:58]This is based off of the data that was already provided to us. Age is a confounder. So when age was included in the regression model, it was adjusted for. It was important to adjust for age, which is why my estimate was no longer an overestimate of the actual average treatment effect.

[00:52:22]On the other hand, if a person is consuming more sodium, they are likely to have more protein in their urine. That is an association.

[00:52:33]In fact, it's a fairly causal association. On the other hand, if a person has high blood pressure, they are also more likely to have protein in their urine. So the problem with this is that protein is actually acting as a collider. Both my treatment and my outcome have a common effect, which is protein in urine. And this is my problem. You do not control for colliders, because when you do control for colliders, you introduce a collider bias. So in the regression equation, I had included a collider.
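Collider bias can be seen in isolation with a toy simulation. The sketch below is an assumed example, separate from the sodium data: `a` and `b` are generated independently, and `c` is a common effect of both. Regressing `b` on `a` alone recovers roughly no relationship, while adding `c` as a control creates one.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

a = rng.normal(size=n)           # first cause
b = rng.normal(size=n)           # second cause, independent of a
c = a + b + rng.normal(size=n)   # collider: common effect of a and b

# Regress b on a alone (with an intercept).
X_without = np.column_stack([np.ones(n), a])
coef_without = np.linalg.lstsq(X_without, b, rcond=None)[0][1]

# Regress b on a while also controlling for the collider c.
X_with = np.column_stack([np.ones(n), a, c])
coef_with = np.linalg.lstsq(X_with, b, rcond=None)[0][1]

print("coefficient on a, no controls:", round(coef_without, 3))
print("coefficient on a, controlling for c:", round(coef_with, 3))
```

Under these assumed equations the coefficient on `a` is zero in theory without the control and -0.5 with it, even though `a` has no effect on `b` at all. The same mechanism is at work when protein in urine is added to the sodium regression.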

[00:53:17]So now my result was an underestimate.

[00:53:22]All right, that will be all from my side. If you guys have any questions about this tutorial, please do reach out to me. Thank you.
