Structural causal models (SCMs) represent causal relationships as data-generating programs with exogenous variables (random noise, outside the model) and endogenous variables (determined inside the model). The distinction between conditioning, P(H | W), and intervention, P(H | do(W)), is central: conditioning selects rows from existing data, while intervention changes the data-generating process itself by removing the arrows coming into the intervened variable and fixing its value. In general P(H | W) and P(H | do(W)) differ. When intervening on W leaves the distribution of H unchanged, that is, when P(H | do(W)) equals P(H), W has no causal effect on H in the model. This framework lets researchers separate true causal effects from spurious correlations produced by confounding variables.
Deep Dive
Tutorial 6
Okay, there we go. As the first part of today's tutorial we are going to look at a case study. It will not take us very long. This was left over from the last tutorial: multivariate regression and how you can use it to control for certain confounding variables. There is an entire data generating, or rather data loading, process over here which I'm not going to go over; instead we will look at the results and infer from what is given to us.

This is the problem statement. In Southeast Asia we often consume foods with high salt content, and high salt content is known to impact blood pressure. The National Health Survey of Pakistan estimated that hypertension affects 18% of Pakistani adults, and it is well known that high blood pressure is associated with an increased risk of mortality. So we are interested in finding the causal effect of sodium intake, meaning salt intake, on the blood pressure of an individual. The problem is taken from a paper which is referenced over here; feel free to explore it on your own later. That is where we obtained our data from.

The outcome variable Y is the systolic blood pressure, which is normally supposed to be around 120. The treatment T is the sodium intake, and it is just a binary variable: if the sodium intake is more than 3.5 mg then T is 1, which means the person has a high sodium intake; if it is less than 3.5 mg then T is 0, which means the person's sodium intake is within the level we allow. Then there are measured covariates as well. The first covariate is age, because we think age may be impacting both the treatment and the outcome. The second covariate is the amount of protein in the urine of an individual (proteinuria), which has also been measured. We are given the data of 1 million individuals. This is a simulated example, and you already know that the average treatment effect (ATE) is exactly 1.05, so let's hold on to that number as we move on.

This is what the data set looks like: 1 million rows across four columns. Blood pressure is Y, the outcome. Sodium is the treatment T. Age and proteinuria are both X; you can call them X1 and X2 because those are our two covariates.

What we've done first is split our data into two groups, one for which the sodium intake is high and one for which it is low, so a 0/1 split, and then we look at the mean blood pressure of both groups. For the group with a sodium value of 0, which means an acceptable sodium intake, the average blood pressure is actually higher than for the group whose sodium intake is high. That is a bit contradictory: our assumption was that sodium intake has an adverse effect on blood pressure, so people with high salt intake would probably have higher blood pressure, but when we look at just the means we see it the opposite way. When you calculate the average treatment effect using the difference in means, you get 5.32, and we know the actual value is supposed to be 1.05, so this is a very, very bad estimate. The same thing has been visualized for you over here.

Moving on, we're going to try to estimate the average treatment effect by controlling for different variables, remembering that the true value is 1.05, and we're going to control for those variables using regression. Let me quickly walk you through what multivariate regression is; you are also going to study this concept in module 4, machine learning. A regression has the value you are predicting on one side, say y, and y is expressed in terms of a sequence of variables plus an intercept. For instance, y = mx + c is a simple line. Here m would just be alpha, a coefficient. So Y is your outcome, and you express the outcome as a coefficient multiplied by the treatment plus an intercept. That equation is simple linear regression. Multivariate regression is not much different: you express one variable, your outcome, in terms of multiple variables, so you have multiple coefficients alpha_1 to alpha_n, alpha_T is the coefficient of the treatment, and you also add a constant term, the intercept, to your regression equation.

This is a machine learning model that maps different input values to an output value. Imagine three variables as input and one variable that you're trying to predict, so you are moving in a four-dimensional space. Your model fits your data to a certain equation, and then you check how it performs on new data. Here is another example, where you're trying to predict cholesterol using age and exercise. Say you're trying to determine the impact of exercise on cholesterol and age is a confounding variable. You write the multivariate regression equation and train the model, and by assigning a coefficient to age the model controls for age.

Let's look at that through an example. This is the initial regression model equation we have formed. alpha_0 is, like I told you, an intercept. alpha_1 is the parameter you're assigning to sodium, which is your treatment. You can ignore the last term at the end; that's just an error term that goes into your model. So you're trying to predict Y, the blood pressure of patients, using just sodium, the treatment. We do have age and proteinuria as well, but we're not considering those two right now.

Let's quickly walk over what the code is doing. X stores the sodium values and y holds the blood pressure, because I want to predict y using X, that is, blood pressure using just sodium. I've initialized my model as a linear regression model. In this case it's simple linear regression, because you're predicting the output from a single input variable. If you had multiple variables over here it would be multivariate linear regression, but it would still be linear regression. model.fit is the command that trains your model. Then I have extracted two data frames: in one I have isolated all the patients who have a high sodium intake, and in the other all the patients who have a low sodium intake. I used the model I trained to predict the blood pressures for both categories of patients, and then I simply found the difference in means again. This time it's "with adjustment", but the difference in means is 5.32, and if we circle back, the earlier value was 5.32 as well. There might be slight differences, but essentially nothing has changed, because you didn't cater for any confounding variables to begin with. We know this is still a big overestimate of the actual ATE of 1.05.

So now we're going to introduce other variables into the equation and see the impact. This is my multivariate linear regression equation, with sodium and age. Y is again blood pressure, the value I'm trying to predict, sodium is the treatment, and age is a confounder. This is the only thing that has changed in my code, and the ATE I obtain now is very close to our actual value of 1.05: we obtain 1.446. Just by controlling for age we have almost reached the true ATE. Why does this happen? Because age is a confounder. A confounder is something that affects both the treatment and the outcome. Age may affect the sodium intake, and for our data set it does, and it may affect blood pressure.

Next, let's train a model using both age and proteinuria, so I'm introducing another covariate as well. The answer we obtain now is an underestimate of the actual 1.05: it is 0.848. I'm going to pause and give you a few seconds to think about why this happened. Shouldn't my value have come closer to the actual value of 1.05? I have introduced another covariate, absolutely, and I have used the same data set again to train the model. Okay, I'm going to pause again for a few seconds and then we'll move on from this. That protein is mediating, so this is not a good control? You've basically almost gotten there.
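The three regressions compared in this case study can be sketched as follows. This is a minimal sketch under stated assumptions, not the tutorial's notebook: the simulated data set, its coefficients (other than the true effect of 1.05), the sample size, and the helper `ols` are all illustrative, and the sketch reads the sodium coefficient directly rather than predicting for the two groups and differencing. The printed numbers will therefore not match the tutorial's exactly.

```python
import random

def ols(columns, y):
    """Least-squares fit of y ~ 1 + columns via the normal equations.

    Hypothetical helper for this sketch; returns [intercept, coef_1, ...].
    """
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(rows[0])
    # Augmented system [X'X | X'y].
    A = [
        [sum(r[a] * r[b] for r in rows) for b in range(k)]
        + [sum(r[a] * yi for r, yi in zip(rows, y))]
        for a in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(k):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [v - factor * pv for v, pv in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Assumed simulation: age drives both sodium and blood pressure (a confounder),
# while proteinuria is driven by both sodium and blood pressure.
random.seed(0)
n = 20_000
age = [random.gauss(65, 5) for _ in range(n)]
sodium = [1.0 if a / 18 + random.gauss(0, 1) > 3.5 else 0.0 for a in age]
bp = [1.05 * t + 2.0 * a + random.gauss(0, 1) for t, a in zip(sodium, age)]
protein = [0.4 * t + 0.3 * b + random.gauss(0, 1) for t, b in zip(sodium, bp)]

# With a single binary regressor the coefficient equals the difference in group means.
naive = ols([sodium], bp)[1]
age_adjusted = ols([sodium, age], bp)[1]
with_protein = ols([sodium, age, protein], bp)[1]

print(f"sodium only:           {naive:.3f}")
print(f"sodium + age:          {age_adjusted:.3f}")
print(f"sodium + age + protein: {with_protein:.3f}")
```

In this assumed simulation, the sodium-only estimate is far too large, adjusting for age brings it close to 1.05, and additionally adjusting for proteinuria pushes it below the true value, which is the same pattern the tutorial reports.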
Anyway, look at the causal directed acyclic graph. We realize that age is something that may be affecting both: maybe with age the sodium intake has increased, or with age the blood pressure has increased. But when we look at proteinuria, it is probably not something that affects the blood pressure or the sodium intake. Rather, your sodium intake affects the amount of protein in urine, and similarly the blood pressure also impacts the amount of protein in urine. In a causal directed acyclic graph this is essentially a fork... oh, my bad, sorry, it is a collider; the prior one, age, was a fork. That is what makes it a bad control. I'm not going to get into the depth of causal DAGs right now because we have that practice in our next tutorial, inshallah. But when you have a causal DAG given to you and you're trying to estimate the true effect of a treatment on an outcome, you condition on, or control for, a fork, but you don't control for a collider.

Are there any questions at this point? Absolutely, absolutely. Regression adjusts for certain variables, so this is another way of controlling, but it is still a matter of which variables should be controlled for and which shouldn't, because controlling for the wrong ones will create a spurious correlation between your treatment and your outcome. Are there any other questions before we move on? Okay, I'm going to leave this for you to go over. It's the same thing; you're just referencing the coefficients of the regression model to read off the average treatment effect.

With that we move on to our next case study, which concerns structural causal models. These were covered in, I think, lecture number 13. In our last session we went over conditioning versus intervention. We're going to do some probability math as well as go over a simple algorithm to see how this works.

Suppose you have a computer program that generates a data distribution in a step-by-step manner. I have this right over here. This is called a data generating program. Why do we use a data generating program? Because you want to know the precise causal mechanism: you want to understand the causal relationships between different variables, and this is important in observational studies. You can change different lines of code, as we saw in our previous case study. You can change the data generating algorithm, you can intervene, and you can observe the intervention's impact on the outcome. That's what we're going to look at again.

In our last session we went over a case study that covered Simpson's paradox. In that case study you already had a data set and you were trying to go from the data set toward understanding the data generating program, or to use the data set to estimate the causal impacts of different treatments on different outcomes. This time we do it the other way around: we have our data generating algorithm in front of us.

I'm going to walk you through this algorithm. You have three independent Bernoulli random variables. I'll clarify again what a Bernoulli random variable means: it's like a coin flip. A Bernoulli variable can only take one of two values, 0 or 1. That's the first important thing. The second is that it is 1 with a certain probability, which you define. We can generate these algorithmically and use them to map onto other variables. These are called exogenous variables, or noise, because they are outside the scope of your model. For instance, suppose you were trying to determine the impact of smoking on cancer, and you have a covariate such as exercise, or another covariate such as age. The covariates, the treatment variable and the outcome variable are all part of your model; they are determined within your model. But the variables U1, U2 and U3, the Bernoulli random variables, are coin-flip variables with value 0 or 1, and they are outside your model. Your model has nothing to do with them. In fact, we use them to generate values for our endogenous variables.

In this case we have three endogenous variables, W, X and H. This is a mapping of the example covered in class. X determines whether an individual exercises or not, W refers to whether they are overweight or not, and H refers to whether or not they have heart disease. In our problem we are trying to determine whether being overweight leads to heart disease. The covariate is whether individuals exercise, because exercising may determine whether an individual is overweight, and exercising has an impact on whether individuals develop heart disease. So you assign values to X, W and H based on your Bernoulli random variables as part of your data generating process. Using this process we have generated 10,000 individuals. Let me give you an overview of the generating process, which we've already covered in class (I have taken this exact example from class), and then we'll look at the code that generates data according to this algorithm.

U1 is 1 with a probability of 1/2, which means there is a 50% chance that U1 is 0 and a 50% chance that U1 is 1. U2 is 1 with a probability of 1/3, which means there is a 2/3 chance of U2 being 0.
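A Bernoulli variable as described here can be sketched in a few lines. This is a minimal illustration using only the Python standard library (the session itself uses NumPy); the helper name `bernoulli`, the seed, and the sample size are assumptions made for this sketch.

```python
import random

def bernoulli(p, rng):
    """Return 1 with probability p and 0 otherwise (a biased coin flip)."""
    return 1 if rng.random() < p else 0

rng = random.Random(0)
n = 10_000

# Exogenous noise: drawn outside the model, independent of each other.
u1 = [bernoulli(1 / 2, rng) for _ in range(n)]
u2 = [bernoulli(1 / 3, rng) for _ in range(n)]

# The fraction of ones estimates the probability of drawing a 1.
freq_u1 = sum(u1) / n
freq_u2 = sum(u2) / n
print(f"share of ones in U1: {freq_u1:.3f} (target 0.5)")
print(f"share of ones in U2: {freq_u2:.3f} (target 0.333)")
```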
U3 is exactly the same as U2. Then X is equal to U1: whatever value U1 generates, 0 or 1, you store in X as well. We know that X represents whether an individual exercises or not, so there is a 50% chance of either. Next we have W, whether an individual is overweight or not. W is 0 if X is 1; otherwise, if X is not 1, W is equal to U2. So if X is equal to 0, the probability of W being 1 is the probability of U2 being 1, which is 1/3. H works the same way as W, except that it uses U3 instead of U2.

Very quickly, I want to show you the graph we get from this data generating algorithm. This is the causal directed acyclic graph that we obtain. The variables in gray are your exogenous variables. They are not part of your model; they were just used to generate your data. The green variables, along with their arrows, are what define your model right now. X impacts both W and H. You may or may not add an arrow over here, from W to H; whether W impacts H is what we're trying to determine. Are there any questions up to this point? Okay, moving on.

Let's look at how the data generating algorithm works in code. I still have to run from the top, so I'll do that and meanwhile walk you through the code. First look at the three variables we've initialized. U1 returns np.random.binomial. This is a NumPy library function where you specify n and p: n is the number of trials and p is the probability of success, so with n set to 1 it returns 1 with probability 0.5 and 0 otherwise. Similarly, for U2 you have 0.33, which is 1/3, and for U3 you have 1/3 as well. Now I create a data frame with the columns X, W and H, and I generate 10,000 rows based on the algorithm discussed. First I generate X, which is just equal to U1. Then I check: if X equals 1 then W equals 0, based on this line; otherwise, in the else branch, W = U2. Similarly, if X equals 1 then H equals 0, otherwise H equals U3. That is how I have created my data frame, which looks like this.

Now we're given a joint distribution over the random variables X, W and H, and we're going to compute different probabilities of interest. This time we are not computing them by hand; we will compute them from the generated data. After that I'll quickly show you how to do the exact same computation by hand from the data generating algorithm, which is a reiteration of something covered in class.

First of all, you want to calculate the probability of H being equal to 1. H denotes whether a person has heart disease or not, so we want the probability of heart disease for the general population. Hold on, let me quickly rerun this, because I think something went wrong when I ran it halfway. Okay. The probability of H is calculated simply as the mean. I'm going to give you a few seconds to think about why the mean directly gives us the probability of H being equal to 1. For now, the answer is 0.1676, which is approximately 1/6, and we do want it to be approximately 1/6, but we'll do that math later. Take a few seconds and think about why the mean function directly gives me this probability. Would someone like to comment on this?

Okay. Suppose this is an example of the series you obtain, and I want to compute the probability of H being equal to 1. I can simply count the number of ones: 1, 2, 3, 4, 5 and 6. Then I take the length of H, which is 9. So 6 over 9, or 2 over 3, is the probability of H being equal to 1, and that's exactly how I've mapped it over here as well. The mean works for this because it sums the values and divides by the number of values. When I sum them up I get the count of ones, because the zeros contribute nothing, and the number of values I divide by is the length of H. So now we know that the probability of people having heart disease in the general population is roughly 1/6.

Next we want to find the conditional probability of H = 1 given that W = 1. What does this translate to? I want the probability of heart disease among people who are overweight. I can calculate this from my data frame very easily. First I have isolated all the rows where W equals 1, because that's my condition, and stored them in a separate data frame called the conditioned data frame; I've conditioned on W. After that, from the conditioned data frame, I have filtered out those rows where H equals 1 as well.
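The data generating program and the "mean as probability" step can be sketched as below. This is a standard-library sketch rather than the NumPy/pandas code walked through in the session; the function name `sample_individual`, the dictionary layout, and the seed are assumptions for illustration.

```python
import random

def sample_individual(rng):
    """Draw one individual (X, W, H) from the data generating program."""
    # Exogenous coin flips.
    u1 = 1 if rng.random() < 1 / 2 else 0
    u2 = 1 if rng.random() < 1 / 3 else 0
    u3 = 1 if rng.random() < 1 / 3 else 0
    # Endogenous variables.
    x = u1                    # exercises?
    w = 0 if x == 1 else u2   # overweight?
    h = 0 if x == 1 else u3   # heart disease?
    return {"X": x, "W": w, "H": h}

rng = random.Random(0)
rows = [sample_individual(rng) for _ in range(10_000)]

# H is 0/1, so summing counts the ones and dividing by the length gives P(H = 1).
h_values = [r["H"] for r in rows]
p_h = sum(h_values) / len(h_values)
print(f"estimated P(H = 1) = {p_h:.4f} (1/6 is about 0.1667)")
```

The estimate varies with the seed and sample size, so it will be near 1/6 rather than equal to the 0.1676 quoted in the session.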
So the H data frame has those rows where W is equal to 1 and H is equal to 1, and the conditioned data frame contains all the rows where W is equal to 1. Now I can simply divide the length of the H data frame, where W = 1 and H = 1, by the length of the conditioned data frame, where just W = 1, and that gives me the conditional probability. It comes out to approximately 35.3%, which is approximately 1/3. Are there any questions as far as this math is concerned? All right, moving on.

So the probability of heart disease in the general population was approximately 1/6, and the probability of heart disease in people who are overweight is roughly 1/3, which is double. Can we immediately conclude that being overweight makes people more likely to have heart disease? That is the question we're trying to answer. Using these two values we cannot determine that, because the association we have seen between W and H is not causal. Let me explain why.

Come back to your graph over here. In the initial graph, which was mapped from the data generating algorithm, I haven't drawn an arrow from W to H; I've only drawn these two arrows. I know that U1 generates X, and when you look at the third and fourth lines you see that W equals 0 when X is 1 and H equals 0 when X is 1, which is what makes us draw these arrows: X impacts the value of W as well as the value of H. So the impact I am observing from W to H is because of X. The increase in probability when I condition on W cannot be an effect of W itself, simply because the value of H is generated using X. Of course, sure.
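The two probabilities compared above can also be computed exactly by enumerating the eight settings of (U1, U2, U3). This is a sketch using Python's `fractions` module; the helper names `outcomes` and `prob` are hypothetical, and the last two lines (holding X fixed at 0) are an added check that is not part of the session, included to show that W carries no information about H once X is known.

```python
from fractions import Fraction
from itertools import product

# P(U1 = 1), P(U2 = 1), P(U3 = 1)
P_ONE = (Fraction(1, 2), Fraction(1, 3), Fraction(1, 3))

def outcomes():
    """Yield (probability, x, w, h) for every setting of the coin flips."""
    for us in product((0, 1), repeat=3):
        weight = Fraction(1)
        for u, p in zip(us, P_ONE):
            weight *= p if u == 1 else 1 - p
        u1, u2, u3 = us
        x = u1
        w = 0 if x == 1 else u2
        h = 0 if x == 1 else u3
        yield weight, x, w, h

def prob(event, given=lambda x, w, h: True):
    """P(event | given): keep the outcomes matching `given`, then take the share matching `event`."""
    kept = [(wt, event(x, w, h)) for wt, x, w, h in outcomes() if given(x, w, h)]
    return sum(wt for wt, hit in kept if hit) / sum(wt for wt, _ in kept)

p_h = prob(lambda x, w, h: h == 1)
p_h_given_w1 = prob(lambda x, w, h: h == 1, lambda x, w, h: w == 1)
print("P(H=1)       =", p_h)
print("P(H=1 | W=1) =", p_h_given_w1)

# Added check: within the X = 0 group, W = 1 and W = 0 give the same P(H = 1).
p_h_w1_x0 = prob(lambda x, w, h: h == 1, lambda x, w, h: w == 1 and x == 0)
p_h_w0_x0 = prob(lambda x, w, h: h == 1, lambda x, w, h: w == 0 and x == 0)
print("P(H=1 | W=1, X=0) =", p_h_w1_x0, " P(H=1 | W=0, X=0) =", p_h_w0_x0)
```

Observing W = 1 tells you that X must be 0, and X = 0 is what raises the chance of H = 1; that is how X produces the association between W and H.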
First take a few seconds and just read through this part quickly, then we can reiterate. All right, I think that should be good enough, so I'm going to go over this again quickly. When we look at our data generating process for H, this last line, H is 0 if X is 1, otherwise H is U3. Just looking at this we know that H is determined by X and U3. W has no role in generating the value of H. That's why we didn't draw an arrow from W to H, and that's why we know the impact we see is because of X: the value of X was used to generate the value of W, and it was used to generate the value of H as well. So it is through X that you observe a spurious correlation between W and H. I hope that answers your question. Okay, perfect, let's move on. I've just added this for your reference, to circle back to.

Now we are going to intervene, and I'm going to clarify the difference between conditioning and intervention once again. This is the probability we have to calculate now. First you were calculating the probability of H given that W is 1. That is called conditioning: you took the rows where W was equal to 1, and from those you looked at the rows where heart disease was true. Now we are going to look at the probability of H given do(W = 1), which means we intervene and fix the value of W to 1.

So this is what happens. In brown over here you have your prior data generating algorithm. When you calculate the probability of H = 1 given do(W = 1), you are intervening and changing your data generating process: you are actively setting the value of W to 1. In the graph, W's dependency on X is removed. You remove this arrow and replace the assignment of W with the value you choose. This is how it changes your data generating algorithm: now W is set to 1. I have changed this row. That is the difference between conditioning and intervention. Beyond this point it's very simple: all you have to do is look at the new data generating algorithm, which is written in blue, and use that algorithm to calculate the probability of H = 1 given that W = 1.

First let me show you in code how we have changed this. No, no, not at all, because as far as X is concerned, the impact of exercise on heart disease is still there. It doesn't really matter to us, because there could be a hundred other things affecting heart disease. I hope that answers your question. The treatment is whether the person is overweight or not. Exercise has an impact on the treatment in our data set, and you have to remove such paths. By intervening and fixing the value of W, say to 0, it is no longer dependent on X. By fixing W to 0 I can see the probability of a person developing heart disease given that they are not overweight, and by fixing W to 1 I can look at the probability of a person developing heart disease given that they are overweight; from those two I can compute the average treatment effect of W on H. So by intervening and fixing the value of W, the dependency is removed. Is that all right? Absolutely, exactly. If I fix it to 1, I know that a person is overweight; if I fix it to 0, I know that a person is not overweight. We're going to try both of those.

With that, moving on. Now I want to change my algorithm to match this new data generating process. U1, U2 and U3 are not changed; they are exactly the way they were. X is still the way it was, X = U1, and H is still the way it was as well: if X is 1 then H is 0, otherwise H equals U3. The only thing I have changed is W.
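The modified program can be sketched by adding an optional intervention argument to the sampler. This is a standard-library sketch, not the session's pandas code; the parameter name `do_w`, the tuple layout, and the seed are assumptions for illustration.

```python
import random

def sample_individual(rng, do_w=None):
    """Draw (x, w, h); if do_w is given, W is fixed to it instead of listening to X and U2."""
    u1 = 1 if rng.random() < 1 / 2 else 0
    u2 = 1 if rng.random() < 1 / 3 else 0
    u3 = 1 if rng.random() < 1 / 3 else 0
    x = u1
    if do_w is None:
        w = 0 if x == 1 else u2   # original mechanism
    else:
        w = do_w                  # intervention: the arrow from X into W is cut
    h = 0 if x == 1 else u3       # unchanged: H never reads W
    return x, w, h

rng = random.Random(0)
n = 10_000
intervened = [sample_individual(rng, do_w=1) for _ in range(n)]

p_w_do = sum(w for _, w, _ in intervened) / n
# Same conditioning steps as before, now applied to the intervened data.
rows_w1 = [row for row in intervened if row[1] == 1]
p_h_do_w1 = sum(h for _, _, h in rows_w1) / len(rows_w1)

print(f"P(W = 1) after do(W = 1): {p_w_do}")
print(f"estimated P(H = 1 | do(W = 1)) = {p_h_do_w1:.4f} (1/6 is about 0.1667)")
```

Calling the same function with `do_w=0` gives the other arm of the comparison.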
Now W is equal to 1: I have intervened, fixed the value of W to 1, and generated another data frame. When I look at the different probabilities in this data frame, the probability of X being equal to 1 is 0.49, and the probability of H being 1, which we already computed once as well, is approximately 1/6. But the probability of W being 1 is equal to 1, because we have fixed W. This is what the do operator does for you.

With that, let's move on. The data generating algorithm has changed, and the value of W is now fixed by intervention. You can just compute the probability of H equals 1 given that W equals 1, and this process remains exactly the same as it was before. The result that I obtain is 0.1 1603, which is again approximately 1/6. So this is how you have intervened and found the probability of H equals 1. Similarly, I can intervene over here, set this value to zero, and then see how that impacts the probability. Let's give it a few seconds; hopefully it should run fast. Okay, for the sake of saving time I'm quickly going to run it again.

While this code runs we can make an important observation, though. The initial probability of heart disease in the general population was 1/6. Now I have intervened and fixed the value of W to one, so now I'm talking about overweight people, and in that case as well the probability of heart disease is approximately 1/6. What does that tell us about the effect of being overweight on heart disease, based on this data? Of course I have to change this to zero as well. Oh, my bad, because W is fixed to zero now. This is also approximately 1/6, and when I fixed this to one that was also approximately 1/6. So essentially that tells me that there is no impact of being overweight on the probability of developing heart disease.

Let me just show you over here as well; this is what we did last time. We calculated the probabilities, they both were approximately 1/6, and when you use these probabilities to find the estimated average treatment effect, it is zero, which tells us that there is no relationship between those two. Or let me clarify: it tells us that there is no causal relationship between those two. Initially, about 33% of the people who were overweight did have heart disease; that was the difference. But when we intervened to observe whether the association is causal or not, we realized that it was just an association and that it is not causal. And again, reiterating, that makes complete sense, because in our data generating algorithm H was never determined by W. I'm not talking about real life here; in real life there may well be an impact, I'm sure, but not in our model, given the way we generated the data.

All right, so I'm going to pause over here for a few seconds and let you guys think about one particular scenario. One of the most important takeaways right now should be that P(H | W = 1) is generally not equal to P(H | do(W = 1)). But there is a case where these are equal as well, so I want you to take a few seconds and think about that.

Okay, or rather, this is not what I was talking about. These two are not generally equal. What we observed right now is that P(H | do(W = 1)) was equal to the probability of H for the general population. So essentially, if W actually has a causal effect on H, if there is a causal relationship between these two, then those probabilities will generally not be equal. On that note, are there any questions before we move on to our next probability math practice question?

[A student responds, mentioning a causal relationship and confounding factors.]

Could you come again? No, no, I absolutely mistakenly started with the wrong question. What I meant to say was this: P(H | W = 1) and P(H | do(W = 1)) are generally not equal. The probabilities we observed to be equal were the probability of heart disease in the general population and the probability of heart disease given do(W = 1). Those two were equal in our case, but those two are not always equal. So my question was: when are those two not equal to each other? When a causal relationship exists, yes. And if they are equal to each other, you can instantly say that there is no causal relationship between these two variables. I hope that makes sense.

Okay, I'm going to re-share my screen and we're going to move on. Well, I seem to have lost the OneNote tab; just give me a few seconds. There we go. So, just to reiterate very quickly, I would like to go over this problem as well with regards to probability, and then I have another example too.

First of all, I was not going to find the probability of heart disease in the general population, or rather, let's do that actually, because we can build off of it. Suppose you have to find the probability that H equals 1. If H equals 1, it means that X has to be zero. That is the first thing I can decide, because if X was equal to one then H could never be one; H would definitely have been zero. And there is one more thing as well: if X is zero, then H is equal to U3, so U3 must be equal to 1 as well in order for H to be one. So this is what I'm trying to find: the probability of X = 0 and U3 = 1.

Hold on, just give me a few seconds. Okay, anyway, let's see the probability of X being equal to zero. X depends on U1, basically, so this means that U1 equals 0, because if U1 is zero then X will be equal to zero. Then I can multiply this with the probability that U3 equals 1. For U1, we know that there's a 50% chance it is one, so of course there's a 50% chance that it is zero as well. For U3, the chance of it being one is 1/3. So P(H = 1) = P(U1 = 0) × P(U3 = 1) = 1/2 × 1/3, and that is how you obtain 1/6.
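The notebook experiment described above can be sketched roughly as follows. This is a minimal reconstruction, not the tutorial's actual notebook: the structural equations are inferred from the spoken description (X follows U1, W is 1 only when X is 0 and U2 is 1, H equals U3 when X is 0 and is 0 otherwise), and `P_U2`, `sample`, and `prob_h` are hypothetical names and values, since the transcript never states P(U2 = 1).

```python
import random

# Assumed structural equations, reconstructed from the spoken description:
#   U1 ~ Bernoulli(1/2), U2 ~ Bernoulli(P_U2), U3 ~ Bernoulli(1/3)
#   X = U1
#   W = 1 only if X == 0 and U2 == 1
#   H = U3 if X == 0, otherwise 0
P_U2 = 0.5  # hypothetical: the transcript never states P(U2 = 1)


def sample(rng, do_w=None):
    """Draw one individual; if do_w is given, W is fixed by intervention."""
    u1 = int(rng.random() < 0.5)
    u2 = int(rng.random() < P_U2)
    u3 = int(rng.random() < 1 / 3)
    x = u1
    # do(W = w) replaces W's structural equation with the constant w.
    w = (1 - x) * u2 if do_w is None else do_w
    # H never reads W, so the intervention cannot reach it.
    h = (1 - x) * u3
    return {"X": x, "W": w, "H": h}


def prob_h(n, seed, do_w=None):
    """Monte Carlo estimate of P(H = 1), optionally under do(W = do_w)."""
    rng = random.Random(seed)
    return sum(sample(rng, do_w)["H"] for _ in range(n)) / n


n = 200_000
p_h = prob_h(n, seed=0)               # general population
p_h_do1 = prob_h(n, seed=1, do_w=1)   # everyone set to W = 1
p_h_do0 = prob_h(n, seed=2, do_w=0)   # everyone set to W = 0
ate = p_h_do1 - p_h_do0               # estimated average treatment effect

print(p_h, p_h_do1, p_h_do0, ate)
```

All three estimates should land near 1/6 and the estimated average treatment effect near zero, up to sampling noise.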
All right, that 1/6 is exactly the value that we found. Well, not exactly, but the value we found in our notebook, using the data that we had generated with our data generating algorithm, was very close to 1/6 as well. So now I'm showing you how you can obtain the same thing using your probability math, just by looking at the data generating algorithm, because in an exam, if you are given a data generating algorithm, you are obviously not going to be able to run it. You should be able to do this directly as well.

Now suppose you have to find the probability that H = 1 given that W is 1. This is the next value that we found, importantly, so let's see what we can do over here. Now your data generating algorithm has been changed; this is what you have over here. Oh, yeah, you're absolutely right, I skipped ahead. Let's hold on to this for our next question. So the value of W is not fixed by intervention here; rather, we're just observing from our given algorithm.

If W is equal to 1, that means that X cannot be equal to one; X must be equal to zero. That's the first important thing for W to be equal to one. The second important thing is that U2 must be equal to 1. So this is what we have been able to find, and I'm just going to write this down over here. According to our conditional probability equation, P(H = 1 | W = 1) is the probability of H equals 1 and W equals 1, divided by the probability of W equals 1. We essentially do not need to go all the way here, but I have found that students sometimes struggle with this, so just to reiterate how this works.

Let's see: H equals 1 and W equals 1. For H equals 1, you have to ensure that X is equal to zero and U3 is equal to 1. So for the numerator I'm going to write down the probability of U3 = 1, into the probability of X = 0, into the probability of U2 = 1. Let's see where I'm going with this. In the denominator you have the probability of W being equal to 1, and we know that in order for W to be equal to 1, again X has to be equal to zero and U2 has to be equal to 1. So essentially there are some things over here that I can cancel out: you can cancel out this and this, and then you can cancel out this and this. I'm just going to get rid of this real quick. Oh, my bad, we do not write this thing down twice, and that is how that cancels out. So you're only left with the probability of U3 being equal to 1, and we know that U3 is equal to 1 with a probability of 1/3. Approximately 33.3% is the answer that we obtained from the data for that data generating algorithm as well.

With that, I'm going to add our final important example as well. Now we're going to intervene, and I'm going to set the value of W to 1, essentially changing my data generating algorithm. You have to find the probability that H equals 1 given do(W = 1). Now W is set to 1, but if you look at the data generating process for H, it doesn't depend on W at all. H depends on X and it depends on U3, and we know U3 is a random variable. What does X depend on? X depends on U1. If X depends on U1, and U3 doesn't depend on anything else, it means that H's dependence right now is only on U1 and on U3. H has no dependence on W whatsoever, so this intervention shouldn't make a difference for us.

All right, anyway, let's try and calculate it. In order for H to be equal to one, again we know that X must be equal to 0, and I'm actually not going to write down this entire process again. We know that U1 must be equal to 0, and then we know that U3 must be equal to 1. So essentially it is 1/2 into 1/3, the exact same math that we did before, and it gives us 1/6. Something would have changed only if H depended on W, either directly or indirectly. So in conclusion, fixing the value of W made no difference to H
whatsoever. I could have fixed this to zero and it still would have given me the exact same answer. Are there any questions at this point?

[A student asks about the probability of H = 1 and W = 1.]

Essentially, for the probability of H equals 1 and W equals 1, look at them individually. In order for H to be equal to one, X has to be zero and U3 has to be one, correct? And in order for W to be equal to one, you have to ensure that X is zero and that U2 is equal to 1. So X equals zero is common here; that's why I wrote it down only once. You have to find the probability that H equals 1 and W equals 1, so it is just simple probability. I hope that makes sense.

Okay, are there any other questions at this point? In order to help you guys out with practice, I will attach another probability question for a data generating algorithm along with its solutions, and I would highly recommend that everyone tries to attempt it on their own. I think one should be available in your homework as well, and then you should be good to go. That will be all for today. Thank you.
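The hand calculations above can also be checked by enumerating every setting of the exogenous variables. The sketch below assumes the same structural equations reconstructed from the spoken description (X follows U1, W is 1 only when X is 0 and U2 is 1, H equals U3 when X is 0). The value of `P_U2` is hypothetical because the transcript never states it; it cancels out of P(H = 1 | W = 1) for any nonzero choice. The helper names `bern`, `worlds`, and `prob` are illustrative only.

```python
from fractions import Fraction
from itertools import product

P_U1 = Fraction(1, 2)
P_U2 = Fraction(1, 2)  # hypothetical: not stated in the transcript
P_U3 = Fraction(1, 3)


def bern(p, value):
    """Probability that a Bernoulli(p) variable takes the given value."""
    return p if value == 1 else 1 - p


def worlds(do_w=None):
    """Yield (probability, X, W, H) for every setting of U1, U2, U3."""
    for u1, u2, u3 in product((0, 1), repeat=3):
        weight = bern(P_U1, u1) * bern(P_U2, u2) * bern(P_U3, u3)
        x = u1
        # Under do(W = w), W's own equation is replaced by the constant w.
        w = (1 - x) * u2 if do_w is None else do_w
        h = (1 - x) * u3
        yield weight, x, w, h


def prob(event, do_w=None):
    """Exact probability of an event over (X, W, H)."""
    return sum(p for p, x, w, h in worlds(do_w) if event(x, w, h))


# Conditioning: P(H = 1 | W = 1) = P(H = 1, W = 1) / P(W = 1)
p_joint = prob(lambda x, w, h: h == 1 and w == 1)
p_w1 = prob(lambda x, w, h: w == 1)
p_h_given_w1 = p_joint / p_w1

# Intervention: P(H = 1 | do(W = w)) for both values of w
p_h_do_w1 = prob(lambda x, w, h: h == 1, do_w=1)
p_h_do_w0 = prob(lambda x, w, h: h == 1, do_w=0)

print(p_h_given_w1, p_h_do_w1, p_h_do_w0)
```

Using `Fraction` keeps the arithmetic exact, so the conditional and interventional probabilities can be compared directly with the fractions derived by hand.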











