d_separation - Obsidian Publish

# D-separation ## Overview D-separation (directional separation) is a criterion for determining conditional independence relationships in [[bayesian_networks|Bayesian networks]]. It provides a graphical method to determine whether two sets of nodes are conditionally independent given a third set, based on the structure of the directed acyclic graph. ## Mathematical Foundation ### 1. Path Blocking A path p is blocked by a set of nodes Z if and only if: 1. p contains a chain i → m → j or a fork i ← m → j where m ∈ Z, or 2. p contains a collider i → m ← j where m ∉ Z and no descendant of m is in Z ### 2. D-separation Definition ```math \text{d-sep}_G(X;Y|Z) \iff \text{all paths between X and Y are blocked by Z} ``` where: - X, Y are sets of nodes - Z is the conditioning set - G is the directed acyclic graph ### 3. [[conditional_independence|Conditional Independence]] Implication ```math \text{d-sep}_G(X;Y|Z) \implies X \perp Y | Z ``` ## Implementation ### 1. Basic D-separation Check ```julia struct Path nodes::Vector{Node} edges::Vector{DirectedEdge} end function is_d_separated(graph::BayesianNetwork, X::Set{Node}, Y::Set{Node}, Z::Set{Node}) # Find all paths between X and Y paths = find_paths(graph, X, Y) # Check if all paths are blocked return all(is_path_blocked(path, Z) for path in paths) end function is_path_blocked(path::Path, Z::Set{Node}) for i in 1:length(path.nodes)-2 triple = path.nodes[i:i+2] if is_chain(triple) || is_fork(triple) # Check if middle node is in Z if triple[2] ∈ Z return true end elseif is_collider(triple) # Check if middle node or its descendants are not in Z if !(triple[2] ∈ Z || any(d ∈ Z for d in descendants(triple[2]))) return true end end end return false end ``` ### 2. Path Finding ```julia function find_paths(graph::BayesianNetwork, X::Set{Node}, Y::Set{Node}) paths = Vector{Path}() for x in X, y in Y # Use depth-first search to find paths dfs_paths!(graph, x, y, Path(), paths) end return paths end function dfs_paths!(graph::BayesianNetwork, current::Node, target::Node, current_path::Path, paths::Vector{Path}) # Add current node to path push!(current_path.nodes, current) if current == target # Found a path push!(paths, deepcopy(current_path)) else # Continue search for neighbor in neighbors(graph, current) if neighbor ∉ current_path.nodes dfs_paths!(graph, neighbor, target, current_path, paths) end end end # Backtrack pop!(current_path.nodes) end ``` ### 3. Triple Classification ```julia function is_chain(triple::Vector{Node}) return has_edge(triple[1], triple[2]) && has_edge(triple[2], triple[3]) end function is_fork(triple::Vector{Node}) return has_edge(triple[2], triple[1]) && has_edge(triple[2], triple[3]) end function is_collider(triple::Vector{Node}) return has_edge(triple[1], triple[2]) && has_edge(triple[3], triple[2]) end ``` ## Applications ### 1. [[independence_testing|Independence Testing]] ```julia function test_independence(graph::BayesianNetwork, X::Set{Node}, Y::Set{Node}, Z::Set{Node}) # Check d-separation if is_d_separated(graph, X, Y, Z) return true end # If not d-separated, test conditional independence return test_conditional_independence( graph.distribution, X, Y, Z ) end ``` ### 2. [[markov_blanket|Markov Blanket]] Identification ```julia function find_markov_blanket(graph::BayesianNetwork, node::Node) blanket = Set{Node}() # Add parents union!(blanket, parents(graph, node)) # Add children children_set = children(graph, node) union!(blanket, children_set) # Add children's other parents for child in children_set union!(blanket, setdiff(parents(graph, child), Set([node]))) end return blanket end ``` ### 3. [[causal_inference|Causal Inference]] ```julia function identify_causal_effect(graph::BayesianNetwork, treatment::Node, outcome::Node) # Find adjustment set using d-separation adjustment_set = find_adjustment_set(graph, treatment, outcome) if isnothing(adjustment_set) return "Causal effect not identifiable" end return adjustment_set end function find_adjustment_set(graph::BayesianNetwork, treatment::Node, outcome::Node) # Find all paths paths = find_paths(graph, Set([treatment]), Set([outcome])) # Find minimal set that blocks all backdoor paths return find_minimal_blocking_set(graph, paths) end ``` ## Theoretical Results ### 1. [[faithfulness|Faithfulness]] ```julia function is_faithful(graph::BayesianNetwork, distribution::Distribution) # Check if all conditional independencies # are implied by d-separation for X in subsets(nodes(graph)) for Y in subsets(nodes(graph)) for Z in subsets(nodes(graph)) if is_conditionally_independent(distribution, X, Y, Z) != is_d_separated(graph, X, Y, Z) return false end end end end return true end ``` ### 2. [[completeness|Completeness]] ```julia function is_complete(graph::BayesianNetwork) # Check if graph captures all possible independencies for X in subsets(nodes(graph)) for Y in subsets(nodes(graph)) for Z in subsets(nodes(graph)) if !implies_independence(graph, X, Y, Z) return false end end end end return true end ``` ### 3. [[minimality|Minimality]] ```julia function is_minimal(graph::BayesianNetwork) # Check if removing any edge creates invalid independence for edge in edges(graph) graph_minus_edge = remove_edge(graph, edge) if !violates_dependencies(graph_minus_edge) return false end end return true end ``` ## Best Practices ### 1. Algorithm Design - Use efficient path finding - Cache intermediate results - Implement early stopping - Handle cycles properly ### 2. Implementation - Optimize graph traversal - Use appropriate data structures - Handle special cases - Validate inputs ### 3. Testing - Verify path blocking - Check edge cases - Test with complex graphs - Validate independence implications ## References 1. Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems 2. Geiger, D., Verma, T., & Pearl, J. (1990). Identifying Independence in Bayesian Networks 3. Koller, D., & Friedman, N. (2009). Probabilistic Graphical Models 4. Spirtes, P., Glymour, C., & Scheines, R. (2000). Causation, Prediction, and Search 5. Pearl, J. (2009). Causality: Models, Reasoning, and Inference