# Variational Free Energy
## Overview
Variational Free Energy (VFE) is a fundamental quantity in active inference: an upper bound on surprise (negative log evidence) that measures how well an agent's approximate posterior beliefs, together with its generative model, account for its observations. It serves as a unified objective function for perception, learning, and action selection in both discrete- and continuous-time formulations.
## Mathematical Framework
### Core Definition
The Variational Free Energy is defined as:
```math
F[q] = \mathbb{E}_q[\ln q(s) - \ln p(o,s)]
```
where:
- $q(s)$ is the variational density over states
- $p(o,s)$ is the generative model
- $\mathbb{E}_q$ denotes expectation under $q$
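As a concrete illustration, the following minimal sketch (all numbers are illustrative, not taken from any particular model) evaluates this expectation for a small categorical generative model and checks that it upper-bounds the negative log evidence:
```python
import numpy as np

# Illustrative categorical model with 3 hidden states and one observation.
prior = np.array([0.5, 0.3, 0.2])            # p(s)
likelihood = np.array([0.9, 0.1, 0.2])       # p(o | s) for the observed o
q = np.array([0.7, 0.2, 0.1])                # variational density q(s)

joint = likelihood * prior                   # p(o, s)
F = np.sum(q * (np.log(q) - np.log(joint)))  # E_q[ln q(s) - ln p(o, s)]

evidence = joint.sum()                       # p(o)
print(F, -np.log(evidence))                  # F upper-bounds -ln p(o)
```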
### Discrete Time Formulation
In discrete time, VFE decomposes into a divergence term and a surprise (negative log evidence) term:
```math
F_t = \underbrace{\text{KL}[q(s_t)\,\|\,p(s_t|o_{1:t})]}_{\text{Divergence}} \; \underbrace{-\,\ln p(o_t|o_{1:t-1})}_{\text{Surprise}}
```
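Rearranging the same expression gives the equivalent complexity-accuracy decomposition, in which complexity penalizes departure of the posterior from the prior and accuracy rewards observations that the beliefs explain well:
```math
F_t = \underbrace{\text{KL}[q(s_t)\,\|\,p(s_t)]}_{\text{Complexity}} - \underbrace{\mathbb{E}_{q(s_t)}[\ln p(o_t|s_t)]}_{\text{Accuracy}}
```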
### Continuous Time Extension
The continuous time formulation extends VFE to:
```math
F[q] = \int_t^{t+T} \left[\mathbb{E}_q[\ln q(s(\tau)) - \ln p(o(\tau),s(\tau))] + \frac{1}{2}\mathbb{E}_q[(\dot{s} - f(s,a))^T\Gamma(\dot{s} - f(s,a))]\right] d\tau
```
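A minimal sketch of how this integral might be approximated numerically, assuming Gaussian beliefs evaluated at their means (a Laplace-style simplification) so that the instantaneous term reduces to precision-weighted prediction errors; the observation map `g`, the precision `Pi_o`, and the suppression of explicit action dependence are assumptions made here for brevity:
```python
import numpy as np

def continuous_vfe(mu, mu_dot, obs, f, g, Pi_o, Gamma, dt):
    """Riemann-sum approximation of the integral above.

    Assumes Gaussian beliefs evaluated at their means mu, so the instantaneous
    E_q[ln q - ln p(o, s)] term reduces, up to constants, to a precision-weighted
    observation prediction error.
    mu, mu_dot : (T, n) arrays of belief means and their time derivatives
    obs        : (T, m) array of observations
    f, g       : assumed dynamics and observation maps (action suppressed)
    Pi_o, Gamma: assumed observation and dynamics precision matrices
    """
    F = 0.0
    for t in range(len(mu)):
        eps_o = obs[t] - g(mu[t])        # observation prediction error
        eps_s = mu_dot[t] - f(mu[t])     # dynamics prediction error
        F += 0.5 * (eps_o @ Pi_o @ eps_o + eps_s @ Gamma @ eps_s) * dt
    return F
```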
## Connection to Policy Selection
### Discrete Time Policy Selection
In discrete time, each candidate policy is scored by its expected free energy, and a distribution over policies is obtained by passing the negated, precision-weighted scores through a softmax:
```math
P(\pi) = \sigma(-\gamma G(\pi))
```
where $\sigma$ is the softmax function, $\gamma$ is the policy precision, and $G(\pi)$ is the expected free energy:
```math
G(\pi) = \sum_{\tau} \mathbb{E}_{q(o_\tau,s_\tau|\pi)}[\ln q(s_\tau|\pi) - \ln p(o_\tau,s_\tau|\pi)]
```
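A short sketch of the resulting softmax policy posterior; the expected free energies below are illustrative numbers rather than outputs of a model:
```python
import numpy as np

def policy_posterior(G, gamma=1.0):
    """Softmax over negative expected free energy: P(pi) = sigma(-gamma * G(pi))."""
    logits = -gamma * np.asarray(G)
    logits -= logits.max()                  # numerical stability
    p = np.exp(logits)
    return p / p.sum()

G = [3.2, 2.1, 4.7]                         # illustrative expected free energies
print(policy_posterior(G, gamma=2.0))       # lower G -> higher probability
```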
### Continuous Time Policy Selection
In continuous time, policy selection becomes:
```math
P(\pi) = \sigma(-\gamma \int_t^{t+T} \mathcal{L}_\pi(s(\tau), \dot{s}(\tau), a(\tau)) d\tau)
```
where $\mathcal{L}_\pi$ is the policy-specific Lagrangian.
## Implementation Framework
### 1. Variational Free Energy Computer
```python
import numpy as np

class VFEComputer:
    """Computes Variational Free Energy in both discrete and continuous time."""

    def __init__(self):
        # DiscreteVFE, ContinuousVFE, and PolicyVFE are assumed helper classes
        # implementing the discrete, continuous, and policy-conditioned forms above.
        self.components = {
            'discrete': DiscreteVFE(),
            'continuous': ContinuousVFE(),
            'policy': PolicyVFE()
        }

    def compute_vfe(self,
                    beliefs: Distribution,
                    observations: np.ndarray,
                    time_mode: str = 'discrete') -> float:
        """Compute VFE based on time mode."""
        if time_mode == 'discrete':
            return self.components['discrete'].compute(beliefs, observations)
        return self.components['continuous'].compute(beliefs, observations)

    def compute_policy_vfe(self,
                           policy: Policy,
                           horizon: int,
                           time_mode: str = 'discrete') -> float:
        """Compute policy-specific VFE."""
        return self.components['policy'].compute(policy, horizon, time_mode)
```
### 2. Belief Updating
```python
class BeliefUpdater:
    """Updates beliefs using VFE minimization."""

    def __init__(self):
        self.vfe = VFEComputer()
        self.optimizer = VariationalOptimizer()

    def update_beliefs(self,
                       current_beliefs: Distribution,
                       observation: np.ndarray,
                       time_mode: str = 'discrete') -> Distribution:
        """Update beliefs by minimizing VFE."""
        def objective(params):
            proposed_beliefs = self.construct_beliefs(params)
            return self.vfe.compute_vfe(proposed_beliefs, observation, time_mode)

        optimal_params = self.optimizer.minimize(objective)
        return self.construct_beliefs(optimal_params)
```
### 3. Policy Selection
```python
class PolicySelector:
    """Selects policies using VFE-based evaluation."""

    def __init__(self):
        self.vfe = VFEComputer()

    def select_policy(self,
                      policies: List[Policy],
                      horizon: int,
                      time_mode: str = 'discrete') -> Policy:
        """Select policy by minimizing expected VFE."""
        policy_vfes = []
        for policy in policies:
            vfe = self.vfe.compute_policy_vfe(policy, horizon, time_mode)
            policy_vfes.append(vfe)
        return policies[np.argmin(policy_vfes)]
```
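The classes above lean on helper types (`Distribution`, `Policy`, the optimizers) that are not defined here. As a self-contained complement, the following sketch performs the same perception step directly: it minimizes $F$ over a softmax-parameterized categorical belief by finite-difference gradient descent and recovers the exact posterior. All numbers are illustrative:
```python
import numpy as np

prior = np.array([0.5, 0.3, 0.2])
likelihood = np.array([0.9, 0.1, 0.2])        # p(o | s) for the observed o
joint = likelihood * prior

def vfe(logits):
    q = np.exp(logits - logits.max())
    q /= q.sum()
    return np.sum(q * (np.log(q) - np.log(joint)))

logits = np.zeros(3)
for _ in range(500):                          # finite-difference gradient descent
    grad = np.array([
        (vfe(logits + 1e-5 * e) - vfe(logits - 1e-5 * e)) / 2e-5
        for e in np.eye(3)
    ])
    logits -= 0.5 * grad

q = np.exp(logits - logits.max()); q /= q.sum()
print(q, joint / joint.sum())                 # q approaches the exact posterior
```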
## Advanced Concepts
### 1. Hierarchical VFE
The hierarchical extension of VFE:
```math
F_{\text{hierarchical}} = \sum_{l=1}^{L} F_l + \sum_{l=1}^{L-1} \text{KL}\big[q_l(s_l)\,\|\,p_l(s_l \mid s_{l+1})\big]
```
### 2. Amortized VFE
Using neural networks for efficient VFE computation:
```python
class AmortizedVFE:
    """Amortizes VFE computation with learned encoder and decoder networks."""

    def __init__(self):
        self.encoder = ProbabilisticEncoder()
        self.decoder = ProbabilisticDecoder()

    def compute_amortized_vfe(self, x):
        """Compute VFE using amortized inference."""
        q_params = self.encoder(x)
        p_params = self.decoder(q_params)
        # compute_elbo is assumed to return the negative ELBO,
        # which equals the VFE under this sign convention.
        return self.compute_elbo(x, q_params, p_params)
```
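The `compute_elbo` call above is left abstract. For a diagonal-Gaussian encoder with a standard normal prior and a unit-variance Gaussian decoder, it might look like the following sketch (the shapes and distributional choices are assumptions, not part of the original interface):
```python
import numpy as np

def negative_elbo(x, q_mu, q_logvar, x_recon):
    """Negative ELBO (= VFE) for a diagonal-Gaussian q(z|x) with a standard
    normal prior and a unit-variance Gaussian decoder (assumed convention)."""
    # KL[q(z|x) || N(0, I)] in closed form
    kl = 0.5 * np.sum(np.exp(q_logvar) + q_mu**2 - 1.0 - q_logvar)
    # Gaussian reconstruction term, -E_q[ln p(x|z)] up to constants
    recon = 0.5 * np.sum((x - x_recon)**2)
    return recon + kl
```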
### 3. Stochastic VFE
Extension to stochastic dynamics:
```math
F_{\text{stochastic}} = F + \mathbb{E}_q[\frac{1}{2}\text{tr}(D\nabla^2\ln q)]
```
## Planning with VFE
### 1. VFE-based Planning
```python
class VFEPlanner:
    """Plans actions using VFE minimization."""

    def __init__(self):
        self.vfe = VFEComputer()
        self.trajectory_optimizer = TrajectoryOptimizer()

    def plan_trajectory(self,
                        initial_state: np.ndarray,
                        goal_state: np.ndarray,
                        horizon: int) -> Trajectory:
        """Plan trajectory by minimizing VFE."""
        def objective(trajectory):
            # compute_trajectory_vfe is an assumed extension of VFEComputer
            # that scores a whole trajectory against the goal state.
            return self.vfe.compute_trajectory_vfe(trajectory, goal_state)

        return self.trajectory_optimizer.optimize(objective, initial_state, horizon)
```
### 2. Active Inference Planning
```python
class ActiveInferencePlanner:
    """Implements active inference planning using VFE."""

    def __init__(self):
        self.vfe = VFEComputer()
        self.policy_selector = PolicySelector()

    def plan_actions(self,
                     beliefs: Distribution,
                     policies: List[Policy],
                     horizon: int) -> List[Action]:
        """Plan actions using active inference."""
        selected_policy = self.policy_selector.select_policy(policies, horizon)
        return self.extract_action_sequence(selected_policy, beliefs)
```
## Applications
### 1. Perception
- Belief updating
- State estimation
- Parameter learning
### 2. Action
- Policy selection
- Motor control
- Decision making
### 3. Learning
- Model learning
- Skill acquisition
- Habit formation
## References
- [[friston_2010]] - "The free-energy principle: a unified brain theory?"
- [[bogacz_2017]] - "A tutorial on the free-energy framework for modelling perception and learning"
- [[buckley_2017]] - "The free energy principle for action and perception: A mathematical review"
## See Also
- [[active_inference]]
- [[expected_free_energy]]
- [[path_integral_free_energy]]
- [[belief_updating]]
- [[policy_selection]]
## Theoretical Foundations
### Connection to Free Energy Principle
The relationship between VFE and the Free Energy Principle can be expressed through:
```math
\begin{aligned}
& \text{1. Existence:} \\
& F_{\text{existence}} = \mathbb{E}_Q[\ln Q(s) - \ln P(o,s)] \geq -\ln P(o) \\
& \text{2. Boundary:} \\
& F_{\text{markov}} = \mathbb{E}_Q[\ln Q(μ,b) - \ln P(μ,b,η)] \\
& \text{3. Dynamics:} \\
& \dot{F} = -\frac{\partial F}{\partial s}^T \Gamma \frac{\partial F}{\partial s} \leq 0
\end{aligned}
```
### Information Geometric Structure
The geometry of VFE manifolds:
```math
\begin{aligned}
& \text{Fisher Metric:} \\
& g_{ij} = \mathbb{E}_Q\left[\frac{\partial \ln Q}{\partial θ_i}\frac{\partial \ln Q}{\partial θ_j}\right] \\
& \text{Natural Gradient:} \\
& \dot{θ} = -g^{-1}\frac{\partial F}{\partial θ} \\
& \text{Geodesic Flow:} \\
& \ddot{θ}^i + \Gamma^i_{jk}\dot{θ}^j\dot{θ}^k = 0
\end{aligned}
```
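As a concrete instance of the natural-gradient flow, the sketch below fits a univariate Gaussian to a Gaussian target by preconditioning the ordinary gradient with the inverse Fisher metric; the $(\mu, \log\sigma)$ parameterization, its closed-form diagonal Fisher metric, and the KL objective standing in for $F$ are choices made here for illustration:
```python
import numpy as np

# Natural-gradient descent of KL[q || p] for univariate Gaussians,
# q = N(mu, sigma^2), p = N(m, v), parameterized by theta = (mu, log sigma).
m, v = 2.0, 0.25                            # illustrative target
mu, log_sigma = -1.0, 0.0

for _ in range(200):
    sigma2 = np.exp(2 * log_sigma)
    grad = np.array([
        (mu - m) / v,                       # dKL/dmu
        sigma2 / v - 1.0,                   # dKL/d(log sigma)
    ])
    fisher = np.array([1.0 / sigma2, 2.0])  # diagonal Fisher metric
    mu, log_sigma = np.array([mu, log_sigma]) - 0.1 * grad / fisher

print(mu, np.exp(2 * log_sigma))            # approaches (m, v) = (2.0, 0.25)
```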
### Statistical Physics Connection
The thermodynamic interpretation:
```math
\begin{aligned}
& F = U - TS \\
& U = \mathbb{E}_Q[E(s)] \\
& S = -\mathbb{E}_Q[\ln Q(s)] \\
& \beta = \frac{1}{T} = \text{precision}
\end{aligned}
```
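For $\beta = 1$ this identity is just the defining expectation rewritten; the sketch below checks it numerically on an illustrative categorical example:
```python
import numpy as np

# Check F = U - T*S against E_q[ln q - ln p(o, s)] for beta = 1/T = 1.
prior = np.array([0.5, 0.3, 0.2])
likelihood = np.array([0.9, 0.1, 0.2])
q = np.array([0.7, 0.2, 0.1])

E = -np.log(likelihood * prior)             # energy E(s) = -ln p(o, s)
U = np.sum(q * E)                           # internal energy E_q[E(s)]
S = -np.sum(q * np.log(q))                  # entropy of q
F_thermo = U - S                            # T = 1
F_variational = np.sum(q * (np.log(q) - np.log(likelihood * prior)))
print(np.isclose(F_thermo, F_variational))  # True
```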
## Advanced Implementation Frameworks
### 1. Hierarchical VFE Computer
```python
class HierarchicalVFE:
    """Computes hierarchical VFE across multiple scales."""

    def __init__(self, n_levels: int):
        self.n_levels = n_levels
        self.level_computers = [
            VFEComputer() for _ in range(n_levels)
        ]
        self.level_couplings = [
            LevelCoupling() for _ in range(n_levels - 1)
        ]

    def compute_hierarchical_vfe(
            self,
            beliefs: List[Distribution],
            observations: List[np.ndarray]
    ) -> Tuple[float, dict]:
        """Compute VFE across hierarchy."""
        # Level-wise computation
        level_vfes = []
        for l in range(self.n_levels):
            vfe = self.level_computers[l].compute_vfe(
                beliefs[l], observations[l])
            level_vfes.append(vfe)

        # Coupling (KL) terms between adjacent levels
        coupling_terms = []
        for l in range(self.n_levels - 1):
            coupling = self.level_couplings[l].compute(
                beliefs[l], beliefs[l + 1])
            coupling_terms.append(coupling)

        # Total VFE
        total_vfe = sum(level_vfes) + sum(coupling_terms)
        metrics = {
            'level_vfes': level_vfes,
            'coupling_terms': coupling_terms
        }
        return total_vfe, metrics
```
### 2. Information Geometric Optimizer
```python
class InfoGeometricVFE:
    """Optimizes VFE using information geometry."""

    def __init__(self):
        self.metric_computer = FisherMetric()
        self.connection_computer = LeviCivitaConnection()
        self.geodesic_solver = GeodesicFlow()

    def optimize_vfe(
            self,
            initial_beliefs: Distribution,
            observations: np.ndarray,
            n_steps: int
    ) -> Distribution:
        """Optimize VFE using natural gradient."""
        current_beliefs = initial_beliefs
        for _ in range(n_steps):
            # Compute Fisher metric
            metric = self.metric_computer.compute(current_beliefs)
            # Compute connection coefficients
            connection = self.connection_computer.compute(current_beliefs, metric)
            # Compute VFE gradient
            grad_vfe = self.compute_vfe_gradient(current_beliefs, observations)
            # Natural gradient step: precondition by the inverse metric
            natural_grad = np.linalg.solve(metric, grad_vfe)
            # Geodesic update
            current_beliefs = self.geodesic_solver.step(
                current_beliefs, natural_grad, connection)
        return current_beliefs
```
### 3. Stochastic VFE Dynamics
```python
class StochasticVFE:
    """Implements stochastic VFE dynamics."""

    def __init__(self):
        self.sde_solver = StochasticDifferential()
        self.noise_generator = NoiseProcess()
        self.drift_computer = DriftField()

    def simulate_vfe_dynamics(
            self,
            initial_state: np.ndarray,
            time_span: float,
            dt: float,
            temperature: float
    ) -> np.ndarray:
        """Simulate stochastic VFE dynamics."""
        def drift(x, t):
            return -self.drift_computer.compute_field(x)

        def diffusion(x, t):
            return np.sqrt(2 * temperature)

        trajectory = self.sde_solver.solve(
            drift, diffusion, initial_state, time_span, dt)
        return trajectory
```
## Advanced Mathematical Bridges
### 1. Path Integral to VFE Bridge
The connection between path integral formulation and VFE:
```math
\begin{aligned}
& \text{Path Integral VFE:} \\
& F_{\text{PI}}[q] = \int \mathcal{D}[s(\tau)] q[s(\tau)] \left(\ln q[s(\tau)] - \ln p[s(\tau),o(\tau)]\right) \\
& \text{Discrete-Continuous Bridge:} \\
& F_{\text{bridge}} = \lim_{\Delta t \to 0} \sum_t F_t\Delta t = \int_0^T F(\tau)d\tau \\
& \text{Action-Value Relationship:} \\
& S[s(\tau)] = \beta \int_0^T \mathcal{L}(s,\dot{s},t)d\tau = -\ln p[s(\tau)]
\end{aligned}
```
### 2. Information Geometric Bridge
```math
\begin{aligned}
& \text{Fisher-Rao Metric:} \\
& g_{\mu\nu}(θ) = \mathbb{E}_{q_θ}\left[\frac{\partial \ln q_θ}{\partial θ^\mu}\frac{\partial \ln q_θ}{\partial θ^\nu}\right] \\
& \text{Natural Gradient Flow:} \\
& \dot{θ}^\mu = -g^{\mu\nu}(θ)\frac{\partial F}{\partial θ^\nu} \\
& \text{Wasserstein Gradient:} \\
& \nabla_W F = -\text{div}(\rho\nabla\frac{\delta F}{\delta \rho})
\end{aligned}
```
### 3. Quantum-Classical Bridge
```math
\begin{aligned}
& \text{Quantum VFE:} \\
& F_Q = \text{Tr}[\rho(\ln\rho - \ln\sigma)] + \beta\text{Tr}[\rho H] \\
& \text{Classical Limit:} \\
& \lim_{\hbar \to 0} F_Q = F_{\text{classical}} \\
& \text{Quantum Policy:} \\
& |\psi_\pi\rangle = \sum_a \sqrt{P(a|\pi)}|a\rangle
\end{aligned}
```
## Multi-Scale Implementation Frameworks
### 1. Multi-Scale Integration Engine
```python
class MultiScaleIntegrationEngine:
    """Integrates VFE computation across scales."""

    def __init__(self):
        self.quantum_computer = QuantumVFEComputer()
        self.classical_computer = ClassicalVFEComputer()
        self.path_integral_computer = PathIntegralComputer()
        self.scale_bridge = ScaleBridgeComputer()

    def compute_multi_scale_vfe(self,
                                quantum_state: QuantumState,
                                classical_state: ClassicalState,
                                path_config: PathConfiguration,
                                scale_params: ScaleParameters) -> Dict[str, float]:
        """Compute VFE across multiple scales."""
        # Quantum scale computation
        quantum_vfe = self.quantum_computer.compute(quantum_state)
        # Classical scale computation
        classical_vfe = self.classical_computer.compute(classical_state)
        # Path integral computation
        path_vfe = self.path_integral_computer.compute(path_config)
        # Bridge computations
        quantum_classical_bridge = self.scale_bridge.bridge_quantum_classical(
            quantum_vfe, classical_vfe)
        classical_path_bridge = self.scale_bridge.bridge_classical_path(
            classical_vfe, path_vfe)
        return {
            'quantum': quantum_vfe,
            'classical': classical_vfe,
            'path': path_vfe,
            'q_c_bridge': quantum_classical_bridge,
            'c_p_bridge': classical_path_bridge
        }
```
### 2. Advanced Geometric Optimizer
```python
class GeometricOptimizer:
    """Geometric optimization for VFE."""

    def __init__(self):
        self.metric_computer = FisherRaoMetric()
        self.connection_computer = LeviCivitaConnection()
        self.parallel_transport = ParallelTransport()

    def optimize_geometric(self,
                           initial_state: Distribution,
                           target_state: Distribution,
                           n_steps: int = 100) -> Distribution:
        """Optimize using geometric methods."""
        current_state = initial_state
        for _ in range(n_steps):
            # Compute metric
            metric = self.metric_computer.compute(current_state)
            # Compute connection
            connection = self.connection_computer.compute(current_state, metric)
            # Compute geodesic toward the target under the current metric
            geodesic = self.compute_geodesic(current_state, target_state, metric)
            # Parallel transport update
            current_state = self.parallel_transport.transport(
                current_state, geodesic, connection)
        return current_state
```
### 3. Stochastic Path Integral Computer
```python
class StochasticPathIntegralComputer:
    """Computes path integrals with stochastic dynamics."""

    def __init__(self):
        self.sde_solver = StochasticDifferential()
        self.path_sampler = PathSampler()
        self.action_computer = ActionComputer()

    def compute_stochastic_path_integral(self,
                                         initial_state: np.ndarray,
                                         final_state: np.ndarray,
                                         beta: float,
                                         n_samples: int = 1000) -> float:
        """Compute path integral using stochastic sampling."""
        paths = []
        actions = []
        for _ in range(n_samples):
            # Sample path
            path = self.path_sampler.sample(initial_state, final_state)
            # Compute stochastic action
            action = self.action_computer.compute_stochastic(path, beta)
            paths.append(path)
            actions.append(action)
        # Monte Carlo estimate of the partition function and free energy
        Z = np.mean(np.exp(-np.array(actions)))
        F = -np.log(Z) / beta
        return F
```
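A self-contained version of the estimator above, with the pieces that the helper classes hide made explicit: paths are sampled from a Brownian reference measure and the action is the potential term, so the result is a Feynman-Kac estimate of the free energy relative to the free (Brownian) reference. The quadratic potential and all parameters are illustrative:
```python
import numpy as np

rng = np.random.default_rng(0)
beta, dt, n_steps, n_samples = 1.0, 0.01, 100, 2000
x0 = 0.0
potential = lambda x: 0.5 * x**2             # illustrative quadratic potential

actions = np.empty(n_samples)
for i in range(n_samples):
    increments = rng.normal(0.0, np.sqrt(dt), n_steps)
    path = x0 + np.cumsum(increments)        # Brownian reference path
    actions[i] = beta * np.sum(potential(path)) * dt

Z = np.mean(np.exp(-actions))                # partition function vs. reference
F = -np.log(Z) / beta                        # relative free energy
print(F)
```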
## Advanced Theoretical Extensions
### 1. Relativistic VFE Framework
```math
\begin{aligned}
& \text{Covariant VFE:} \\
& F_{\text{cov}} = \int d^4x \sqrt{-g} \left(T^{\mu\nu}\nabla_\mu\nabla_\nu\ln\rho + V[\rho]\right) \\
& \text{Spacetime Action:} \\
& S_{\text{spacetime}} = \int d^4x \sqrt{-g}\mathcal{L}(\phi, \partial_\mu\phi) \\
& \text{Causal Structure:} \\
& \delta F_{\text{cov}}/\delta\rho = 0 \text{ on } J^+(x)
\end{aligned}
```
### 2. Quantum Field Theory Extension
```math
\begin{aligned}
& \text{Field VFE:} \\
& F_{\text{field}} = \int \mathcal{D}[\phi] \rho[\phi] \left(\ln\rho[\phi] - \ln P[\phi]\right) \\
& \text{Effective Action:} \\
& \Gamma[\phi] = -\ln \int \mathcal{D}[\chi] \exp(-S[\chi]) \\
& \text{Ward Identity:} \\
& \frac{\delta \Gamma}{\delta \phi} = \langle\frac{\delta S}{\delta \phi}\rangle
\end{aligned}
```
### 3. Topological Extensions
```math
\begin{aligned}
& \text{Topological VFE:} \\
& F_{\text{top}} = \oint_{\partial M} \omega + \int_M d\omega \\
& \text{Characteristic Classes:} \\
& c_1(F) = \frac{i}{2\pi}\text{tr}(F) \\
& \text{Index Theorem:} \\
& \text{index}(D) = \int_M \hat{A}(M)\text{ch}(E)
\end{aligned}
```
## Implementation Considerations
### 1. Advanced Numerical Methods
- Symplectic integration for Hamiltonian dynamics (see the leapfrog sketch after this list)
- Adaptive Runge-Kutta methods
- Stochastic variational integrators
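A minimal leapfrog (symplectic) step for a separable Hamiltonian, as referenced in the first point above; the quadratic potential is illustrative and not tied to any particular VFE model:
```python
import numpy as np

# Leapfrog integrator for H(q, p) = p^T p / 2 + V(q).
def leapfrog(q, p, grad_V, dt, n_steps):
    p = p - 0.5 * dt * grad_V(q)                 # half step in momentum
    for _ in range(n_steps - 1):
        q = q + dt * p                           # full step in position
        p = p - dt * grad_V(q)                   # full step in momentum
    q = q + dt * p
    p = p - 0.5 * dt * grad_V(q)                 # final half step
    return q, p

q, p = leapfrog(np.array([1.0]), np.array([0.0]),
                grad_V=lambda q: q, dt=0.1, n_steps=100)
print(q, p)                                      # energy approximately conserved
```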
### 2. Parallel Computing Strategies
- GPU-accelerated path integral computation
- Distributed belief propagation
- Multi-scale parallel processing
### 3. Optimization Techniques
- Natural gradient methods
- Hamiltonian Monte Carlo
- Riemannian optimization
## References
- [[friston_2022]] - "The Free Energy Principle: A Unified Brain Theory?"
- [[parr_2022]] - "Active Inference: The Free Energy Principle in Mind, Brain, and Behavior"
- [[da_costa_2022]] - "The Mathematics of Active Inference"
- [[ramstead_2022]] - "Neural and cognitive architectures for active inference"
## See Also
- [[quantum_field_theory]]
- [[differential_geometry]]
- [[stochastic_processes]]
- [[information_geometry]]
- [[topological_quantum_field_theory]]