Risk Management: Toward Global Scalar Decryption
October 19, 2017
This is a slightly edited version of a talk presented at Pratt as part of #AGI: Accelerate General Intellect, organized by the New Centre in New York, NY, and for SOTHIS Gameboard Research in Philadelphia, PA, both in 2016.
The computer scientist Scott Aaronson remarks that “In a sense that can be made precise, learning and cryptography are ‘dual’ problems.” [1] The basic idea is this: whereas a learner attempts to locate patterns in an apparently random sequence, cryptography seeks to encode patterns in a random sequence such that they cannot be easily located. Consider a simple example from a thriller: two bank robbers meet at a noisy, crowded club to discuss their plans under the cover of all the background activity. Unbeknownst to them, an undercover detective sits in the shadows with a sophisticated listening device, amplifying the signal of the conversation and uncovering the plot amidst the noise.
Similarly, in 2013 one of the world’s most difficult-to-crack encryption methods, 4096-bit RSA, was broken by a side-channel attack that recorded the sounds made by the computer’s CPU as it decrypted a communication [2]. This ingenious method essentially sidestepped the sophisticated introduction of algorithmic randomness by sampling the computer’s acoustic signature as it worked through the keys that had been so carefully obscured. To understand how unlikely this seems, consider that a brute-force attack on a 4096-bit encryption, using all of the available atoms in the universe as processing power, would require a span of time exceeding what has passed since the Big Bang by well over two hundred orders of magnitude. Buried in the annoying noise your computer makes as the processor struggles away with a difficult problem may lie patterns that are decipherable, if not by the human ear alone.
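For a rough sense of the scale involved, here is a back-of-the-envelope sketch in Python; the figures it assumes (roughly 10^80 atoms in the observable universe, a generous billion keys tested per atom per second, and about 4.35 × 10^17 seconds since the Big Bang) are my own round numbers rather than anything taken from the cited report.

```python
# Back-of-the-envelope sketch with assumed round numbers (not from the source).
ATOMS = 10**80                                 # assumed atom count of the observable universe
KEYS_PER_ATOM_PER_SECOND = 10**9               # generously fast per-atom "processor"
SECONDS_SINCE_BIG_BANG = 435 * 10**15          # ~4.35e17 s, a common estimate

keyspace = 2**4096                             # naive brute-force search space
keys_tried = ATOMS * KEYS_PER_ATOM_PER_SECOND * SECONDS_SINCE_BIG_BANG
shortfall = keyspace // keys_tried             # factor by which we still fall short

print(f"keyspace:  ~10^{len(str(keyspace)) - 1} candidate keys")
print(f"tried:     ~10^{len(str(keys_tried)) - 1} keys under these assumptions")
print(f"shortfall: ~10^{len(str(shortfall)) - 1} ages of the universe still needed")
```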
This is probably a good place to make a distinction between noise and randomness that has thus far been blurred by my presentation. When I refer to noise, I do not mean it in the mundane sense, as when your roommate keeps humming that Taylor Swift song that you hate. Throughout this presentation, I will take a general view of stimulus as information, and define noise as apparently random information. I say apparently random because not all noise is meaningless, or, put another way, truly random. From a subjective perspective, the whine that the processor makes in the previous example appears to be random, but it may be easily decipherable with the proper tools. But what about the randomness of the 4096-bit encryption? This too is actually only pseudo-random, as it is based on the product of two large primes — though any direct attempt to unlock its pattern is a computational problem many magnitudes more difficult than discerning the pattern in the processor’s acoustic signature.
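As a toy illustration of what “pseudo-random but structured” means here, the following sketch builds an RSA-style key pair from two absurdly small primes; the values are standard textbook numbers of my own choosing, nothing resembling a real 4096-bit key.

```python
# Toy illustration: the apparent randomness of RSA ciphertext rests on the
# structure n = p * q. Anyone who can factor n recovers the hidden pattern.
# Requires Python 3.8+ for the three-argument modular inverse pow(e, -1, phi).
p, q = 61, 53                  # two small primes; real keys use ~2048-bit primes
n = p * q                      # public modulus: 3233
phi = (p - 1) * (q - 1)        # Euler's totient: 3120
e = 17                         # public exponent, coprime with phi
d = pow(e, -1, phi)            # private exponent: 2753

message = 42
cipher = pow(message, e, n)            # m**e mod n -- looks arbitrary to an observer
assert pow(cipher, d, n) == message    # ...but the structure lets us undo it
print(n, cipher)
```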
Truly random noise may carry a large amount of information, but no discernible syntax, no rules that would allow one to structure it. Complex information may be perceived as random, though one could potentially discern a pattern in it, given sufficient means to analyze the sequence. It is an interesting question for information theory, cryptography, computation, and physics whether there really is any such thing as true randomness, or whether all apparent randomness is merely pseudorandom, since true randomness can only be defined negatively, as the absence of a pattern.
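A pseudorandom generator makes the point concrete: the sketch below (a linear congruential generator with standard constants) produces a stream that looks patternless, yet the entire sequence is fixed by a single small seed.

```python
# A minimal pseudorandom generator (classic Numerical Recipes constants): the
# output looks patternless, yet it is entirely determined by one small seed,
# and is therefore highly compressible rather than truly random.
def lcg(seed, n, a=1664525, c=1013904223, m=2**32):
    x = seed
    for _ in range(n):
        x = (a * x + c) % m
        yield x

print(list(lcg(seed=7, n=5)))      # identical output every time for seed=7
```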
The philosopher Ray Brassier has evinced a long standing interest in noise, precisely because of its challenge to coherent organization:
“What I consider to be interesting about noise is its dis-organizing potency: the incompressibility of a signal interfering with the redundancy in the structure of the receiver. Not transduction but schizduction: noise scrambles the capacity for self-organization.” [3]
Noise appears to us as sensory-overload, a disorganized stream of information for which we can find no corresponding conception to match the sensation. We are incapable of translating the information into a meaning beyond a negative apprehension, because we have no given pattern solving mechanism through which we can resolve the stimulus. As Brassier continues:
“I am very wary of ‘aesthetics’: the term is contaminated by notions of ‘experience’ that I find deeply problematic. I have no philosophy of art worth speaking of. This is not to dismiss art’s relevance for philosophy—far from it—but merely to express reservations about the kind of philosophical aestheticism which seems to want to hold up ‘aesthetic experience’ as a new sort of cognitive paradigm wherein the Modern (post-Cartesian) ‘rift’ between knowing and feeling would be overcome. In this regard, I would say that there can be no ‘aesthetics of noise’, because noise as I understand it would be the destitution of the aesthetic, specifically in its post-Kantian, transcendental register. Noise exacerbates the rift between knowing and feeling by splitting experience, forcing conception against sensation.” (Brassier)
In a more amenable frame, Peter Wolfendale discusses not only the aesthetic regime of art but also the semantic one, and attempts to reconcile the gap that Brassier emphasizes by recognizing that what is transmitted through the artwork is not just experience or concepts but information, and that this information is received at different processing levels, which the two frameworks privilege in distinct ways. As he presents it:
“Put simply, the purpose of art is cognitive stimulation. To explain this properly requires a further distinction between cognitive process and cognitive product, or between the information processing subsystem that a given work activates and its results. In those examples favoured by the aesthetic model, the artwork aims to stimulate our non-discursive information processing capacities, elevating their exercise by testing their limits, disrupting them, or simply pushing them beyond their everyday use. Colour discrimination, visual pattern recognition, emotional intelligence, etc., are all subject to stimulation in their own ways, the point being not to produce any particular understanding of their object, but to exercise them for their own sake. In those examples favoured by the semantic model, the artwork aims to stimulate our discursive information processing capacities, inviting us to explore conceptual connections, resolve theoretical tensions, or indeed juxtapose interpretations, without demanding that we arrive at any particular conclusion.” [4]
Wolfendale, like Brassier, distinguishes between the conceptual and experiential, but the account he offers allows for the information that an artwork provides to be distributed across different processing systems and subsystems, while still being understood as contributing to an overall cognitive process.
Leading theories in contemporary neuroscience cast cognitive processes as a hierarchical, bi-directional feedback system, positing semantic and representational systems at higher regulative levels, while sensory inputs such as motor activity, proprioception, light reception, and other basic data detection occur at lower levels and feed upward. The basic idea behind this architecture was first developed by Helmholtz in the mid-19th century, and is based upon a principle of the conservation of energy within closed systems. According to the neuroscientist Karl Friston, the “free-energy” principle entails that
“any self-organizing system that is at equilibrium with its environment must minimize its free energy. The principle is essentially a mathematical formulation of how adaptive systems (that is, biological agents, like animals or brains) resist a natural tendency to disorder.” [5]
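For readers who want the underlying quantity, one standard way of writing variational free energy is sketched below; the notation is my own gloss rather than a quotation from Friston.

```latex
% A gloss (my notation): variational free energy F for sensory data s,
% hidden causes \theta, and an internal recognition density q(\theta).
F \;=\; \underbrace{D_{\mathrm{KL}}\!\left[\, q(\theta) \,\|\, p(\theta \mid s) \,\right]}_{\geq\, 0}
  \;+\; \underbrace{\left( -\ln p(s) \right)}_{\text{surprise}}
```

Since the divergence term is non-negative, F upper-bounds surprise, so minimizing free energy also minimizes (a bound on) the system’s long-run surprise, which is the tendency toward disorder that Friston describes.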
The brain, when viewed as an elaborate Helmholtz machine, minimizes entropy through both top-down and bottom-up processes which can be effectively modeled via Bayesian inference. Bayesianism is an interpretation of probability in which the probability of a hypothesis is weighed against the likelihood of the given evidence. It is often referred to as a form of backwards induction, since one begins with prior expectations of an outcome and updates them according to new evidence. In the hierarchical system advanced here, both higher- and lower-level processes are considered broadly Bayesian. As the philosopher Andy Clark writes:
“Neural representations, should the hierarchical predictive processing account prove correct, encode probability density distributions in the form of a probabilistic generative model, and the flow of inference respects Bayesian principles that balance prior expectations against new sensory evidence. This is a departure from traditional understandings of internal representation, and one whose full implications have yet to be understood. It means that the nervous system is fundamentally adapted to deal with uncertainty, noise, and ambiguity, and that it requires some (perhaps several) concrete means of internally representing uncertainty.” [6]
Put very simply, the brain encodes a prior representational hypothesis about what it expects to see and attempts to minimize surprise — imagine walking down a sidewalk where you are peripherally aware of traffic flowing beside you. Your movement over terrain and even the traffic tend to follow a fairly regular pattern, and while your senses are undoubtedly feeding up new information regarding events around you, the general representation you consciously appreciate as you steer towards your destination is fairly low-cost, since you can roughly approximate the steady transformation of the journey and your surroundings. Suddenly, at the intersection ahead, a car comes careening out of an alleyway and crashes into oncoming traffic. Immediately, a host of new data is fired up from below as sound, sight, and motor reaction pick up the distal stimulus, registering an unlikely event. From a top-down perspective, you have already formed a general picture of the street, the intersection, existing traffic, and its probable representation in the near future — now new evidence is integrated which forces you to adjust this picture — but your brain doesn’t throw out all prior expectations; it integrates and reconciles the new evidence with the old, gelling a new scene in a manner that allows for immediate recognition.
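To make this kind of updating concrete, here is a minimal Bayesian sketch of the scene just described, with invented probabilities standing in for the brain’s weightings.

```python
# A minimal sketch of the updating described above, with invented numbers:
# a prior over two hypotheses about the street scene is revised on new
# evidence (a loud crash) via Bayes' rule, P(H|E) = P(E|H) * P(H) / P(E).
prior = {"routine traffic": 0.99, "accident": 0.01}
likelihood_of_crash = {"routine traffic": 0.001, "accident": 0.9}   # P(E|H)

evidence = sum(likelihood_of_crash[h] * prior[h] for h in prior)    # P(E)
posterior = {h: likelihood_of_crash[h] * prior[h] / evidence for h in prior}

print(posterior)
# Prior expectations are not discarded, only re-weighted:
# {'routine traffic': ~0.10, 'accident': ~0.90}
```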
Modern image, audio, and video compression techniques, for example, operate along similar principles. Compression can transmit a significant amount of information losslessly by encoding only the most ‘unexpected’ information in the image. In most images the variation between any two neighboring pixels falls within a limited probabilistic range – one pixel and its neighbor likely share similar values – such that if one pixel contains information for a dark color, it is highly likely that its neighbor also represents a dark color. One need not encode the value of every pixel, then, but only transmit the information regarding which pixels show the most variation – the “prediction error” – to a properly configured decoder, which then reconstructs the image. The advantage, of course, is that the signal sent to the higher-level “decoder” requires very little bandwidth. Similarly, it has been discovered that in the human retina, “Ganglion cells signal not the raw visual image but the departures from the predictable structure, under the assumption of spatial and temporal uniformity” [7].
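The sketch below illustrates the prediction-error idea with a single row of invented pixel values; it is a schematic of the principle, not the scheme used by any particular codec (or by the retina).

```python
import numpy as np

# Toy prediction-error coding for one image row (invented pixel values):
# predict each pixel from its left neighbour and transmit only the residuals.
row = np.array([120, 121, 119, 120, 122, 200, 201, 199], dtype=np.int16)

predicted = np.concatenate(([0], row[:-1]))   # simplest predictor: the previous pixel
residuals = row - predicted                   # small everywhere except at the "edge"

reconstructed = np.cumsum(residuals)          # the decoder rebuilds the row exactly
assert np.array_equal(reconstructed, row)

print(residuals)                              # [120   1  -2   1   2  78   1  -2]
```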
To summarize where we’re at so far – we can view the brain as a Helmholtz machine that uses a variety of techniques to optimize information transfer. Bayesian inference networks weight expected inputs against new information. Expected priors are generated by the invariances of the scale transformations between the various network levels themselves, building a hierarchical network that is sensitive to the introduction of deviations, while discarding excess information. It therefore seeks to minimize noise from lower levels.
Noise without a discernible pattern is merely a uniform random distribution — on the flip side, a pattern implies a deviation from the mean of the noise. Hierarchical models operate by comparing two different noise channels and locating this deviation, disregarding most other information. As the free-energy principle suggests, this optimizes energy consumption while discarding entropy. In physics, a comparable operation is described as renormalization: the maintenance of invariances across scale transformations. Interestingly, deep learning algorithms operate according to principles similar to those of the hierarchical models we have discussed, setting up multiple layers in a network that filter information across distinct scales while ignoring irrelevant information.
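As a toy illustration of locating a deviation against a noise baseline, the following sketch compares two channels of invented data and keeps only the samples that depart strongly from the reference noise; it is my own illustration, not a model of the brain or of any particular deep network.

```python
import numpy as np

# Toy sketch: estimate the statistics of a reference noise channel, then keep
# only the samples of a second channel that deviate strongly from them.
rng = np.random.default_rng(0)
noise_ref = rng.normal(0.0, 1.0, size=10_000)   # baseline channel: pure noise
signal = rng.normal(0.0, 1.0, size=10_000)      # second channel: noise...
signal[5_000:5_020] += 6.0                      # ...with a brief buried pattern

mu, sigma = noise_ref.mean(), noise_ref.std()
z = (signal - mu) / sigma                       # deviation from the noise baseline
flagged = np.flatnonzero(np.abs(z) > 4.0)       # everything "expected" is discarded

print(flagged)   # mostly indices 5000-5019: the deviation survives, little else
```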
Now, I want to turn back to a point of Brassier’s that was presented early on: [Noise is] “the incompressibility of a signal interfering with the redundancy in the structure of the receiver.” [8] We have been discussing noise in general terms as random information, but in the context where Brassier makes these statements he is also referring to noise music, where amplification cuts across multiple layers of the subject’s network, refusing to be ignored. Whereas the brain actively works to filter out noise, noise music spikes the inputs, introducing an incompressible signal that cannot be renormalized according to any lower-level expectations. Wolfendale might, like Brassier, recognize noise as disruptive of these pre-conceptual aesthetic subsystems, even as it takes form as content on the semantic level. However, Brassier may be correct to argue that any further decomposition of noise requires a decentering of our experiential capabilities. Beyond this, it may even require an externalization of our conceptual capacities — for on the semantic level, what form could an apparently random pattern take, other than a symptomatic one? That is to say, without a recognizable pattern induced by the senses, any subjective conceptual association lacks orientation, and any positive identification must itself be an internally generated error.
In a 2011 experiment operating under Bayesian-brain assumptions, participants were asked to listen to two types of music – one that made them feel happy and another that made them feel sad – with the choice of music left up to the subjects. Subjects were then shown a series of faces embedded in noise and asked to detect these faces while listening to the music. The participants found it much easier to detect faces that were congruent with their own mood (happy or sad). As a control, they were also shown images of random noise containing no faces. In these instances the researchers noted a significant number of false identifications. As they put it:
“As illusory percepts are believed to reflect the content of internal representations that are employed by the brain during top-down processing of visual input, we conclude that top-down modulation of visual processing is not purely predictive in nature: mood, in this case manipulated by music, may also directly alter the way we perceive the world.” [9]
Cross-channel influences from a number of lower-layer inputs can shape the overall top-down recognition of a stimulus, even when they have no direct causal link to the percept under consideration. Why the false positives, then? Prior expectations at the perceptual level are shaped by evolutionary trajectories which favor general environmental conditions in a way that contributes to the continued survival of the organism, even if they are not truly indicative of the environment. An ancient hunter on the grasslands, wary of his surroundings and expecting a tiger in the grass, may see a tiger even if none is there. The hunter who runs away when there is a tiger lives to reproduce, and so does the hunter who sees a tiger and flees even when no tiger is present.
To consider a less exciting example, the convex/concave illusion presents two circles: one shaded dark on its top half, the other shaded dark on its bottom half. We automatically assume that the circle shaded on the bottom is convex, while the circle shaded on the top is concave. This is because, having evolved in an environment with overhead light — the sun — we interpret the volume of such objects on the assumption that they are lit from above. However, if we stipulate that the lighting is coming from below, the impression reverses. Environmental invariants influence cost-effective probabilistic weightings. On the other hand, top-down expectations are shaped by experience and evolve dynamically with local context, generating neural nets weighted with the subject’s history, incorporating local invariances in tension with evolutionary stabilities, and enabling survival even under uncertain environmental conditions.
Now, at this time of incredible and unprecedented environmental flux that we have dubbed the Anthropocene, I want to think through the challenge of Brassier’s amplified noise music in different terms: we might say that noise produces an alienation effect. As the systems theorist R. Felix Geyer indicates, modern alienation may be rendered as a complexity differential: the interior capacity of the individual or subject to process information is exceeded by the exterior complexity of the information to be processed [10]. Amplified noise takes the background remnants that our subsystems normally filter out, shifts their scale, and presents them front and center. All of the ambiguities that our systems evolved to reduce — because attending to them was not evolutionarily convenient — are presented as one big uncompressed problem that we have no internal resources to decode. As I’ve suggested previously, complex information may be perceived as random, though one could potentially discern a pattern in it, given sufficient means to analyze the sequence.
Humans have evolved to deal with certain medium-sized objects in a fairly local environment. Language has helped to extend our overall grasp of the levels above and below, but honestly, we’re limited in our computational power to deal with the scales either below or above us, and it wasn’t very evolutionarily efficient to discern objects in the environment that wouldn’t have some short-term effect on our continued viability. We’ve basically benefited from the unintended consequences of a robust adaptation.
Speculative Postscript
I have peppered my presentation here with different examples of Bayesian systems drawn from recent technologies, including deep learning and image compression. We might add to this list algorithmic trading procedures, statistical analysis, and environmental simulations, amongst a host of others. These techniques, while useful in some respects, fail miserably in others — our current economic techniques, for example, do not account for significant feedback and have a limited range of inputs and action orientation considering the scale of phenomena they are expected to account for — it is not surprising, therefore, that we shift through cycles of boom and bust as the system constantly miscalculates its own homeostasis. Suhail Malik characterizes our contemporary economic order as a risk society, based on mining the future [11] — what is interesting, however, is that our brains are essentially risk calculators, employing predictive modelling to orient our own action. We are, in some measure then, risk beings. But our senses, however flawed they may be with respect to global concerns, are capable of transmitting a significant portion of our environment in a manner that allows for effective steering and responsiveness. The subsystems that filter into our reasoning systems discard a lot of information that is inconsequential to our continued survival, but the systems we’re building as prostheses for our current limitations might have the ability to glue distinct levels of information processing onto our own. Such a system, to be relatively veridical to its environment, requires multiple tiers of information processing. Furthermore, any system which is well adapted to its environment must still be somewhat energy efficient if it is not going to exhaust that environment. It must be able to filter noise from the relevant objects on its own scale.
We need to better understand and incorporate the robust aspects of human cognition, and deploy aspects of brain architecture on a global scale. What is required is a massive intensification of intelligence, not a retreat from the tentative steps that have been taken. Nor do the arts benefit from denying the new paradigms presented by contemporary neuroscience and clinging to 18th-century aesthetic models which favor intuition and experience. If art is to contribute to this future, it must think alongside these models to better mediate and bridge the human and the inhuman. I will leave you with one final comment: “Decrypt everything.”
–
1. Scott Aaronson, “Why Philosophers Should Care About Computational Complexity”.
2. Sebastian Anthony, “Researchers crack the world’s toughest encryption by listening to the tiny sounds made by your computer’s CPU”, ExtremeTech, December 18, 2013.
3. Ray Brassier and Bram Ieven, “Against an Aesthetics of Noise”, nY-web, May 10, 2009.
4. Peter Wolfendale, “The Artist’s Brain at Work“.
5. Karl Friston, “The Free Energy Principle: A Unified Theory of the Brain?”, Nature Reviews Neuroscience, January 13, 2010.
6. Andy Clark, “Whatever Next? Predictive brains, situated agents, and the future of cognitive science”, Behavioral and Brain Sciences, 2013.
7. Hosoya et al., “Dynamic predictive coding by the retina”, Nature, July 2005.
8. Ray Brassier and Bram Ieven, 2009.
9. Jacob Jolij and Maaike Meurs, “Music Alters Visual Perception”, PLOS ONE, April 21, 2011.
10. R. Felix Geyer, “Individual Alienation and Information Processing: a Systems Theoretical Conceptualization“, 1976.
11. Suhail Malik, “The Ontology of Finance: Price, Power, and the Arkhéderivative“, Collapse VIII, December 2014.