
Causal Networks


A causal network is a directed network that illustrates the causal dependencies among all the components in the network.

A causal relationship exists when one variable in a data set has a direct influence on another variable. Thus, one event triggers the occurrence of another event. A causal relationship is also referred to as cause and effect.

The ability to identify truly causal relationships is fundamental to developing impactful interventions in medicine, policy, business, and other domains.

Often, in the absence of randomised controlled trials, causal inference must be performed purely from observational data. However, in this case the well-known caveat that 'correlation does not imply causation' applies: one must distinguish between events that cause specific outcomes and those that merely correlate with them. One possible explanation for correlation between variables where neither causes the other is the presence of confounding variables that influence both the target and a driver of that target. Unobserved confounding variables are a severe threat when doing causal inference on observational data.

A causal generalization, e.g., that smoking causes lung cancer, is not about any particular smoker but states that a special relationship exists between the property of smoking and the property of getting lung cancer. As a causal statement, this says more than that there is a correlation between the two properties.

Some causal conditions are necessary conditions: The presence of oxygen is a necessary condition for combustion; in the absence of oxygen there is no combustion. "Cause" is often used in this sense when the elimination of the cause is sought in order to eliminate the effect (What is causing the pain?).

Some causal conditions are sufficient conditions: When a sufficient condition is present, the effect must occur (being in temperature range R in the presence of oxygen is sufficient for the combustion of many substances). "Cause" is often used in this sense when one seeks to produce the effect (What causes this metal to be so strong?).

Looking for special circumstances: What was the cause of the fire? Oxygen, or an arsonist's match?

Causes are sometimes said to be INUS conditions in that they are Insufficient but Necessary parts of an Unnecessary but Sufficient set of conditions for the effect. Striking a match may be said to be a cause of its lighting. Suppose there is some set of conditions that is sufficient for a match's lighting. This might include the presence of oxygen, the appropriate chemicals in the match head, and the striking. The striking can be said to be a necessary part of this set (though insufficient by itself) because without the striking among those other conditions the match would not have lit. But the set itself, though sufficient, is not necessary, because other sets of conditions could have produced the lighting of the match.

Statisticians are careful to distinguish between two different interpretations of a relationship: correlation and causation. Every successful prediction model Y ~ X is a demonstration that there is a correlation between the response Y and the explanatory variable X. ("Successful" means that the prediction performance of the model is better than the performance of a no-input model.) But the performance of the model does not itself tell us that X causes Y in the real world. There are other possible configurations that will produce a correlation between X and Y. For instance, both X and Y may themselves have a common cause C without X being otherwise related to Y. In such a circumstance, a real-world intervention to change X will have no effect on Y. To put this in the form of a story, consider that the start of the school year and leaves changing color are correlated. But an intervention to start the school year in mid-winter will not result in leaves changing color. There is a common cause for the school year and colorful foliage that produces the relationship: the end of summer.
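To make the common-cause story concrete, the following is a minimal simulation sketch (the variable roles and noise levels are illustrative assumptions, not taken from the text above): X and Y are both driven by a confounder C, so they correlate strongly in observational data, yet forcing X to a new value, i.e. intervening on X, leaves Y unchanged.

```python
import random

random.seed(0)

# Common cause C drives both X and Y; X has no direct effect on Y.
def observe():
    c = random.gauss(0, 1)          # confounder, e.g. "end of summer"
    x = c + random.gauss(0, 0.3)    # e.g. "school year has started"
    y = c + random.gauss(0, 0.3)    # e.g. "leaves have changed color"
    return x, y

# Intervention: force X to a fixed value; Y still depends only on C.
def intervene(x_fixed):
    c = random.gauss(0, 1)
    return x_fixed, c + random.gauss(0, 0.3)

def corr(pairs):
    """Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs) / n
    vx = sum((x - mx) ** 2 for x, _ in pairs) / n
    vy = sum((y - my) ** 2 for _, y in pairs) / n
    return cov / (vx * vy) ** 0.5

obs = [observe() for _ in range(10_000)]
print(f"observational corr(X, Y): {corr(obs):.2f}")   # ~0.9: strong correlation

ints = [intervene(5.0) for _ in range(10_000)]
mean_y = sum(y for _, y in ints) / len(ints)
print(f"mean of Y after do(X=5):  {mean_y:.2f}")      # ~0.0: moving X does not move Y
```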

Structural Causal Models (SCMs)

Structural causal models represent causal dependencies using graphical models that provide an intuitive visualisation by representing variables as nodes and relationships between variables as edges in a graph.

SCMs serve as a comprehensive framework unifying graphical models, structural equations, and counterfactual and interventional logic.

Graphical models serve as a language for structuring and visualising knowledge about the world and can incorporate both data-driven and human inputs.

Counterfactuals enable the articulation of what one wishes to know, and structural equations serve to tie the two together.

SCMs have had a transformative impact on multiple data-intensive disciplines (e.g. epidemiology, economics, etc.), enabling the codification of existing knowledge in diagrammatic and algebraic forms and consequently leveraging data to estimate the answers to interventional and counterfactual questions.

Bayesian Networks are one of the most widely used classes of SCMs.

Bayesian Networks (BNs)

1. Directed Acyclic Graph (DAG)

A graph is a collection of nodes and edges, where the nodes are some objects and the edges between them represent some connection between these objects. A directed graph is a graph in which each edge is oriented from one node to another node. In a directed graph, an edge goes from a parent node to a child node. A path in a directed graph is a sequence of edges such that the ending node of each edge is the starting node of the next edge in the sequence. A cycle is a path in which the starting node of its first edge equals the ending node of its last edge. A directed acyclic graph (DAG) is a directed graph that has no cycles.
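These definitions translate directly into code. The following is a minimal sketch, assuming a directed graph represented as an adjacency list mapping each parent node to its children (the graph contents are illustrative), that tests whether the graph is acyclic via depth-first search:

```python
# A directed graph as an adjacency list: each edge goes parent -> child.
graph = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
    "D": [],
}

def has_cycle(g):
    """Depth-first search with three colours: finding an edge back to a
    node still on the current path means the graph contains a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {node: WHITE for node in g}

    def visit(node):
        colour[node] = GREY                   # node is on the current path
        for child in g.get(node, []):
            if colour[child] == GREY:         # back edge: a cycle exists
                return True
            if colour[child] == WHITE and visit(child):
                return True
        colour[node] = BLACK                  # fully explored, no cycle here
        return False

    return any(visit(n) for n in g if colour[n] == WHITE)

print(has_cycle(graph))        # False: this graph is a DAG
graph["D"].append("A")         # adding D -> A closes the path A -> ... -> D
print(has_cycle(graph))        # True: no longer acyclic
```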

2. Bayesian Networks

Bayesian networks are probabilistic graphical models that represent the dependency structure of a set of variables and their joint distribution efficiently in a factorised way.

A Bayesian network consists of a DAG, a causal graph where nodes represent random variables and edges represent the relationships between them, together with a conditional probability distribution (CPD) associated with each of the random variables.

If a random variable has parents in the BN, then its CPD represents \(P(\text{variable} \mid \text{parents})\), i.e. the probability of that variable given its parents. When the random variable has no parents, the CPD simply represents \(P(\text{variable})\), i.e. the marginal probability of that variable.

Even if one is interested in the joint distribution of all the variables in the graph, the chain rule of probability requires specifying only the conditional distribution of each variable given its parents, since the joint factorises as \(P(X_1, \ldots, X_n) = \prod_{i=1}^{n} P(X_i \mid \text{parents}(X_i))\).
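As a minimal numeric sketch of this factorisation, consider a hypothetical two-node network Rain -> WetGrass with made-up probabilities; the full joint distribution is obtained purely as a product of the two CPDs:

```python
# P(Rain): the root node's CPD is just a marginal distribution.
p_rain = {True: 0.2, False: 0.8}

# P(WetGrass | Rain): one distribution per value of the parent.
p_wet_given_rain = {
    True:  {True: 0.9, False: 0.1},
    False: {True: 0.2, False: 0.8},
}

# Joint distribution via the factorisation P(R, W) = P(R) * P(W | R).
joint = {
    (r, w): p_rain[r] * p_wet_given_rain[r][w]
    for r in (True, False)
    for w in (True, False)
}

print(joint[(True, True)])     # P(Rain=T, WetGrass=T) = 0.2 * 0.9 = 0.18
print(sum(joint.values()))     # 1.0: the factorised joint is a valid distribution
```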

The links between variables in Bayesian Networks encode dependency but not necessarily causality. Here, the interest is in the case where Bayesian Networks are causal; hence, an edge between two nodes should be read as a cause -> effect relationship.

Since BNs themselves are not inherently causal models, the structure learning algorithms on their own merely learn that there are dependencies between variables. A useful approach to the problem is to first group the features into themes and constrain the search space to inspect how themes of variables relate. If there is further domain knowledge available, it can be used as additional constraints before learning a graph algorithmically.
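As an illustrative sketch of such a constraint (the themes, variables, and ordering below are invented for illustration), candidate edges can be filtered so that they only run from earlier themes to later ones:

```python
# Hypothetical themes and an assumed causal ordering between them.
themes = {
    "demographics": ["age", "sex"],
    "behaviour":    ["smoking", "exercise"],
    "outcome":      ["lung_cancer"],
}
order = ["demographics", "behaviour", "outcome"]

theme_of = {var: theme for theme, vs in themes.items() for var in vs}

def allowed(edge):
    """Keep an edge u -> v only if u's theme does not come after v's."""
    u, v = edge
    return order.index(theme_of[u]) <= order.index(theme_of[v])

candidates = [("smoking", "lung_cancer"), ("lung_cancer", "age"), ("age", "exercise")]
print([e for e in candidates if allowed(e)])
# [('smoking', 'lung_cancer'), ('age', 'exercise')]
```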

Steps for working with a Bayesian Network

BN models are built in a multi-step process before they can be used for analysis; a minimal end-to-end sketch in code follows the list below.

1. Structure learning: The structure of a network describing the relationships between variables can be learned from data, or built from expert knowledge.

2. Structure review: Each relationship should be validated, so that it can be asserted to be causal. This may involve flipping / removing / adding learned edges, or confirming expert knowledge from trusted literature or empirical beliefs.

3. Likelihood estimation: The conditional probability distribution of each variable given its parents can be learned from data.

4. Prediction and inference: The given structure and likelihoods can be used to make predictions, or perform observational and counterfactual inference.
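The following is a minimal end-to-end sketch of steps 1, 3 and 4 on a hypothetical three-node network Rain -> WetGrass <- Sprinkler (the structure and all CPD values are made-up assumptions, not learned from data); step 4's observational inference is done by brute-force enumeration over the factorised joint:

```python
from itertools import product

# Step 1: structure, specified by each variable's parents.
parents = {"Rain": [], "Sprinkler": [], "WetGrass": ["Rain", "Sprinkler"]}

# Step 3: one CPD per variable (values are illustrative, not learned).
def cpd(var, value, assignment):
    """Return P(var = value | parents as fixed in `assignment`)."""
    if var == "Rain":
        p = 0.2                                  # P(Rain = 1)
    elif var == "Sprinkler":
        p = 0.4                                  # P(Sprinkler = 1)
    else:                                        # WetGrass given Rain, Sprinkler
        r, s = assignment["Rain"], assignment["Sprinkler"]
        p = {(0, 0): 0.01, (0, 1): 0.8, (1, 0): 0.9, (1, 1): 0.99}[(r, s)]
    return p if value == 1 else 1.0 - p

def joint(assignment):
    """Factorised joint: the product of each variable's CPD given its parents."""
    result = 1.0
    for var in parents:
        result *= cpd(var, assignment[var], assignment)
    return result

# Step 4: P(Rain = 1 | WetGrass = 1) by enumerating all assignments.
numerator = evidence = 0.0
for r, s in product((0, 1), repeat=2):
    a = {"Rain": r, "Sprinkler": s, "WetGrass": 1}
    evidence += joint(a)
    if r == 1:
        numerator += joint(a)

print(f"P(Rain=1 | WetGrass=1) = {numerator / evidence:.3f}")   # ~0.418
```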

Review Questions

1. Derive Bayes' theorem of probability. Explain with a suitable example its use in expert systems. (Refer section 7.1)

2. What do you mean by probabilistic reasoning? Give an example.

OR

Explain the probabilistic reasoning. (Refer section 7.1)

3. State and prove Bayes' theorem. (Refer section 7.1)

OR

Write a short note on Bayes' theorem.

4. Explain the terms theorem proving and inferencing with examples. (Refer section 7.3)

5. Explain fuzzy logic with examples.

OR

Write short note on fuzzy logic. (Refer section 7.4)

6. Draw fuzzy curve for tall, short, very tall. (Refer section 7.4)

7. Write a short note on: Fuzzy logic. (Refer section 7.4)

8. Explain how fuzzy logic is beneficial over classical probability theory. (Refer section 7.4)

9. Explain the forward and backward reasoning.

OR

Explain forward and backward reasoning with examples.

OR

Explain reasoning with example.

OR

Differentiate between forward and backward reasoning. (Refer section 7.3)
