Relational inductive biases, deep learning, and graph networks - Cedars Hill Group Knowledge Base

# Executive Summary >However, many defining characteristics of human intelligence, which developed under much different pressures, remain out of reach for current approaches. In particular, generalizing beyond one’s experiences—a hallmark of human intelligence from infancy—remains a formidable challenge for modern AI >[...] >We explore how using relational inductive biases within deep learning architectures can facilitate learning about entities, relations, and rules for composing them >[...] >the graph network—which generalizes and extends various approaches for neural networks that operate on graphs, and provides a straightforward interface for manipulating structured knowledge and producing structured behaviors. We discuss how graph networks can support relational reasoning and combinatorial generalization, laying the foundation for more sophisticated, interpretable, and flexible patterns of reasoning. >the principle of **combinatorial generalization**, that is, constructing new inferences, predictions, and behaviors from known building blocks # Compositional >That is, the world is compositional, or at least, we understand it in compositional terms. When learning, we either fit new knowledge into our existing structured representations, or adjust the structure itself to better accommodate (and make use of) the new and the old (Tenenbaum et al., 2006; Griffiths et al., 2010; Ullman et al., 2017). # Relational Reasoning >We define structure as the product of composing a set of known building blocks. “Structured representations” capture this composition (i.e., the arrangement of the elements) and “structured computations” operate over the elements and their composition as a whole. Relational reasoning, then, involves manipulating structured representations of entities and relations, using rules for how they can be composed. We use these terms to capture notions from cognitive science, theoretical computer science, and AI, as follows: > ◦ An **entity** is an element with attributes, such as a physical object with a size and mass. > ◦ A **relation** is a property between entities. Relations between two objects might include same size as, heavier than, and distance from. Relations can have attributes as well. The relation more than X times heavier than takes an attribute, X, which determines the relative weight threshold for the relation to be true vs. false. Relations can also be sensitive to the global context. For a stone and a feather, the relation falls with greater acceleration than depends on whether the context is in air vs. in a vacuum. Here we focus on pairwise relations between entities. > ◦ A **rule** is a function (like a non-binary logical predicate) that maps entities and relations to other entities and relations, such as a scale comparison like is entity X large? and is entity X heavier than entity Y?. Here we consider rules which take one or two arguments (unary and binary), and return a unary property value. >For example, [hidden Markov models](https://en.wikipedia.org/wiki/Hidden_Markov_model) constrain latent states to be conditionally independent of others given the state at the previous time step, and observations to be conditionally independent given the latent state at the current time step, which are well-matched to the relational structure of many real-world causal processes. # Inductive Biases >An inductive bias allows a learning algorithm to prioritize one solution (or interpretation) over another, independent of the observed data (Mitchell, 1980). In a Bayesian model, inductive biases are typically expressed through the choice and parameterization of the prior distribution (Griffiths et al., 2010). # Computations over sets and graphs >While the standard deep learning toolkit contains methods with various forms of relational inductive biases, there is no “default” deep learning component which operates on arbitrary relational structure. **We need models with explicit representations of entities and relations, and learning algorithms which find rules for computing their interactions, as well as ways of grounding them in data**. Importantly, entities in the world (such as objects and agents) do not have a natural order; rather, orderings can be defined by the properties of their relations # Graph Networks >Graphs, generally, are a representation which supports arbitrary (pairwise) relational structure, and computations over graphs afford a strong relational inductive bias beyond that which convolutional and recurrent layers can provide. ## Relational inductive biases in graph networks >Our GN framework imposes several strong relational inductive biases when used as components in a learning process. First, graphs can express arbitrary relationships among entities, which means the GN’s input determines how representations interact and are isolated, rather than those choices being determined by the fixed architecture. For example, the assumption that two entities have a relationship—and thus should interact—is expressed by an edge between the entities’ corresponding nodes. Similarly, the absence of an edge expresses the assumption that the nodes have no relationship and should not influence each other directly. >Second, graphs represent entities and their relations as sets, which are invariant to permutations. This means GNs are invariant to the order of these elements6 , which is often desirable. For example, the objects in a scene do not have a natural ordering (see Sec. 2.2). > >Third, a GN’s per-edge and per-node functions are reused across all edges and nodes, respectively. This means GNs automatically support a form of combinatorial generalization (see Section 5.1): because graphs are composed of edges, nodes, and global features, a single GN can operate on graphs of different sizes (numbers of edges and nodes) and shapes (edge connectivity). # Design principles for graph network architectures ## Attributes >The requirements of the problem will often determine what representations should be used for the attributes. For example, when the input data is an image, the attributes might be represented as tensors of image patches; however, when the input data is a text document, the attributes might be sequences of words corresponding to sentences. ### Edge Focused Uses edges as output, for example to make decisions about interactions among entities ### Node-Focused Uses nodes as output, for example to reason about physical systems ### Graph-Focused Uses globals as output, for example to predict the potential energy of a physical system >The nodes, edges, and global outputs can also be mixed-and-matched depending on the task. For example, Hamrick et al. (2018) used both the output edge and global attributes to compute a policy over actions. ## Graph Structure [[Artificial Intelligence to Read Structure#Two Approaches to AI]] >When defining how the input data will be represented as a graph, there are generally two scenarios: first, the input explicitly specifies the relational structure; and second, the relational structure must be inferred or assumed. These are not hard distinctions, but extremes along a continuum. **Multi-Agent Systems**: where the relational structure is not made explicit. Explore Further: [[Artificial Intelligence]] Tags: Your support for Cedars Hill Group is greatly appreciated <form action="https://www.paypal.com/donate" method="post" target="_top"> <input type="hidden" name="hosted_button_id" value="74PGN8ZXHQVHS" /> <input type="image" src="https://www.paypalobjects.com/en_US/i/btn/btn_donate_LG.gif" border="0" name="submit" title="PayPal - The safer, easier way to pay online!" alt="Donate with PayPal button" /> <img alt="" border="0" src="https://www.paypal.com/en_US/i/scr/pixel.gif" width="1" height="1" /> </form>