Common Introduction to all 6 Posts
History and Context
These blog posts are an extension of my efforts to convince evaluators to shift their focus from complex systems to specific behaviors of complex systems. We need to make this switch because there is no practical way to apply the notion of a “complex system” to decisions about program models, metrics, or methodology. But we can make practical decisions about models, metrics, and methodology if we attend to the things that complex systems do. My current favorite list of complex system behavior that evaluators should attend to is:
Complexity behavior  Posting date 
· Emergence  up 
· Power law distributions  up 
· Network effects and fractals  up 
· Unpredictable outcome chains  up 
· Consequence of small changes  Oct. 12 
· Joint optimization of uncorrelated outcomes  Oct. 19 
For a history of my activity on this subject see: PowerPoint presentations: 1, 2, and 3; fifteen minute AEA “Coffee Break” videos 4, 5, and 6; long comprehensive video: 7.
Since I began thinking of complexity and evaluation in this way I have been uncomfortable with the idea of just having a list of seemingly unconnected items. I have also been unhappy because presentations and lectures are not good vehicles for developing lines of reasoning. I wrote these posts to address these dissatisfactions.
From my reading in complexity I have identified four themes that seem relevant for evaluation.
 Pattern
 Predictability
 How change happens
 Adaptive and evolutionary behavior
Others may pick out different themes, but these are the ones that work for me. Boundaries among these themes are not clean, and connections among them abound. But treating them separately works well enough for me, at least for right now.
Figure 1 is a visual depiction of my approach to this subject.
Figure 1: Complex Behaviors and Complexity Themes. 
 The black rectangles on the left depict a scenario that pairs a well-defined program with a well-defined evaluation, resulting in a clear understanding of program outcomes. I respect evaluation like this. It yields good information, and there are compelling reasons for working this way. (For reasons why I believe this, see 1 and 2.)
 The blue region indicates that no matter how clear-cut the program and the evaluation, both are embedded in a web of entities (programs, policies, culture, regulation, legislation, etc.) that interact with our program in unknown and often unknowable ways.
 The green region depicts what happens over time. The program may be intact, but the contextual web has evolved in unknown and often unknowable ways. Such are the ways of complex systems.
 Recognizing that we have a complex system, however, does not help with developing program theory, formulating methodology, or analyzing and interpreting data. For that, we need to focus on the behaviors of complex systems, as depicted in the red text in the table. Note that the complex behaviors form the rows of a table. The columns show the complexity themes. The Xs in the cells show which themes relate to which complexity behaviors.
Unspecifiable Outcome Chains
Complexity behavior  Pattern  Predictability  How change happens  Adaptive evolutionary behavior  
Emergence  
Power law distributions  
Network effects and fractals  
Unspecifiable outcome chains  X  X  
Consequence of small changes  
Joint optimization of uncorrelated outcomes 
People are enamored of the “butterfly effect”, but I hate it. It is beyond me why evaluators are so drawn to the idea of instability. In my world you can hit programs over the head with data as hard as you can, and they still do not change. My problem is too much stability, not too little. And yet, the notion of sensitive dependence has its place in evaluation. That place is not in uncertainty about what will happen, but uncertainty about the order in which things will happen. I don’t know how frequent a problem this is in evaluation, but I’m pretty sure it exists. I am very sure that evaluators would do well to consider the possibility when they develop program theory.
In what follows I am going to adopt a common view of butterflies and instability. It’s the one that opens Wikipedia’s entry on the butterfly effect: “In chaos theory, the butterfly effect is the sensitive dependence on initial conditions in which a small change in one state of a deterministic nonlinear system can result in large differences in a later state.” Needless to say, this is a very simplistic approach to a very complicated and controversial subject. Read the rest of the Wikipedia entry to get a sense of the issues involved. If you really want to get into it, go to: Chaos.
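For readers who want to see sensitive dependence rather than take it on faith, here is a minimal sketch. It uses the logistic map, a standard textbook example of a deterministic nonlinear system. The map, the parameter r = 4, the starting value 0.2, and the size of the nudge are all assumptions chosen for illustration; none of them model a program or an evaluation.

```python
def logistic_step(x, r=4.0):
    """One step of the logistic map: x -> r * x * (1 - x)."""
    return r * x * (1.0 - x)


def trajectory(x0, steps, r=4.0):
    """Return the list [x0, x1, ..., x_steps] produced by repeated steps."""
    xs = [x0]
    for _ in range(steps):
        xs.append(logistic_step(xs[-1], r))
    return xs


STEPS = 60  # illustrative choice

# Two runs of the same deterministic rule, started a hair apart.
a = trajectory(0.2, STEPS)
b = trajectory(0.2 + 1e-10, STEPS)

# Distance between the two runs at every step.
gaps = [abs(x - y) for x, y in zip(a, b)]

for n in (0, 10, 20, 30, 40, 50, 60):
    print(f"step {n:2d}: gap = {gaps[n]:.3e}")
```

The two runs are indistinguishable at first, and the gap between them grows until they bear no resemblance to each other. Nothing random is involved; the only difference is the tiny change in the starting state.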
The reason for the difficulty in understanding outcome order is that we can be too confident in our estimations of what fits where in outcome chains. We think the sequence is invariant, which most often it probably is. I am convinced, though, that there are times when small perturbations can affect the order. Stated differently, the sequence of outcomes is subject to small random fluctuations in the environment.
I’ll illustrate with a hypothetical example. A friend of mine who does a lot of educational evaluation assures me that it makes some sense. The program in question is designed to improve teachers’ classroom management skills. Figure 2 compares two versions of the program theory. The top of the figure takes the form of a linear sequence. It’s sophisticated in the way it mixes unambiguous relationships and uncertain relationships. The dashed arrows indicate unambiguous relationships. For instance, classroom management leads to job satisfaction, which in turn leads to less tension between teachers and principals. Solid black arrows show ambiguous relationships. For instance, “student satisfaction” is an input to an unspecified collection of other intermediate outcomes.
The bottom form of the model acknowledges limitations in the program theory. It depicts a situation in which better classroom management makes itself felt in a cloud of outcomes that affect each other in elaborate ways, both directly via 1:1 and 1:many relationships, and also via proximate and distal feedback loops. Also, note the two different network possibilities – red solid, and blue dashed. I did that to emphasize that any number of relationships are possible. It would make the picture too complicated, but it is also the case that the network of relationships will be different in each setting where the classroom management program is implemented.
Figure 2: Traditional and Complex Theories of Change 
What will the relationships be in any particular setting? That is an unanswerable question because too many factors will be operating to specify the conditions. All we know is that better classroom management leads to any number of student performance outcomes, which in turn will lead to higher test scores.
If there is so much confusion about intermediate outcomes, why might we be able to reliably expect that the classroom management program will result in higher test scores? Complexity provides two ways to think about this: 1) emergence, and 2) attractor space.
Emergence: A good way to explain emergence is to start with a counter example. Think of a car. Of course the car is more than the sum of its parts. But it is also true that the unique function of each part, and its contribution to the car, can be explained. If someone asked me what a “cylinder” is, I could describe it. I could describe what it does. When I got finished, you would know how the part “cylinder” contributes to the system called a “car”.
In contrast, think about trying to explain a traffic jam only in terms of the movement of each individual car. The jam moves in the opposite direction to the cars of which it is composed. The jam grows at the back as cars slow down, and the front recedes as cars peel off. Looked at as a whole, the traffic jam is clearly something qualitatively different from the movement of any car in the jam. (NetLogo has good one- and two-lane simulations that are worth looking at.) In the “classroom management” case, we might consider “better test scores” as an emergent outcome – one that cannot be explained in terms of the particulars of any of its parts.
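The backward-moving jam can be reproduced with a toy model in a few lines. What follows is a sketch, not the NetLogo model: it assumes a one-lane ring road divided into cells, where each car moves forward one cell per tick if the cell ahead is empty and otherwise stays put. The road length and the starting positions are made up for illustration.

```python
def step(road):
    """Advance the road one tick. Every car looks at the cell ahead in the
    current state and moves into it only if it is empty (ring road)."""
    n = len(road)
    new = [0] * n
    for i, occupied in enumerate(road):
        if not occupied:
            continue
        ahead = (i + 1) % n
        if road[ahead]:
            new[i] = 1       # blocked: the car stays where it is
        else:
            new[ahead] = 1   # free: the car moves forward one cell
    return new


def longest_jam(road):
    """Return (start, length) of the longest run of bumper-to-bumper cars.
    Wrap-around is ignored, which is fine for this small example."""
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for i, occupied in enumerate(road):
        if occupied:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0
    return best_start, best_len


ROAD_LENGTH = 30  # illustrative choice
road = [0] * ROAD_LENGTH
# Four free-flowing cars approach a four-car jam sitting in cells 8 to 11.
for pos in (0, 2, 4, 6, 8, 9, 10, 11):
    road[pos] = 1

jam_starts = []
for tick in range(5):
    start, length = longest_jam(road)
    jam_starts.append(start)
    print(f"tick {tick}: " + "".join("#" if c else "." for c in road))
    road = step(road)

print(jam_starts)  # [8, 7, 6, 5, 4]
```

No car in this model ever moves backward, yet the back of the jam retreats one cell per tick for as long as cars keep arriving behind it. The jam is a property of the whole road, not of any single car.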
Attractor space: There are two ways to think about “attractors” in complexity. The most formal is a mathematical formulation concerning the evolution of a dynamical system over time. As Wikipedia puts it: “In the mathematical field of dynamical systems, an “attractor” is a set of numerical values toward which a system tends to evolve, for a wide variety of starting conditions of the system. System values that get close enough to the attractor values remain close even if slightly disturbed.”
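A small numerical sketch may make the formal definition concrete. It assumes the logistic map with r = 2.5, a textbook system with an attracting fixed point at 1 - 1/r = 0.6. The parameter, the starting values, and the size of the disturbance are illustrative assumptions, not anything drawn from a real program.

```python
def logistic_step(x, r=2.5):
    """One step of the logistic map: x -> r * x * (1 - x)."""
    return r * x * (1.0 - x)


def iterate(x0, steps, r=2.5):
    """Apply the map `steps` times starting from x0 and return the result."""
    x = x0
    for _ in range(steps):
        x = logistic_step(x, r)
    return x


ATTRACTOR = 1.0 - 1.0 / 2.5  # fixed point of the map for r = 2.5, i.e. 0.6

# A wide variety of starting conditions...
starts = [0.05, 0.3, 0.6, 0.95]
finals = [iterate(x0, 100) for x0 in starts]

# ...and a slight disturbance applied once each run has settled.
nudged = [iterate(x + 0.01, 50) for x in finals]

for x0, settled, recovered in zip(starts, finals, nudged):
    print(f"start {x0:.2f} -> settles at {settled:.6f}, "
          f"after a nudge returns to {recovered:.6f}")
```

Every starting value ends up at the same place, and a run that is pushed slightly off that value drifts back to it. That is the sense in which the system "tends to evolve" toward the attractor and stays close when disturbed.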
However, there is a more metaphorical, but still useful, way to think about this. Namely, that an attractor is a “space” that defines how something will move through it. There may be many paths within the attractor, but depending on its “shape”, many paths through it will lead to the same place. Marry this to the open systems notion of “equifinality” and it’s not hard to think in terms of a set of causal relationships among a defined set of variables that will lead to the same outcome. In theory there could be an infinite number of elements and paths that would lead to higher test scores, but that does not matter. What matters is that a particular set of outcomes are meaningful intermediate outcomes for a particular program, that it makes sense to measure those outcomes, and that many different combinations of those intermediate outcomes can be relied upon to produce better test scores.
While I am not sure which way to think about the bottom scenario in Figure 2, I do know that there is an important difference between the “emergence” and “attractor” perspective. With emergence, the specific intermediate outcomes do not matter very much. Which ones are manifest and which ones are not is irrelevant to the emergent result. That may be elegant in its way, but it is not all that satisfying to program funders. After all, they do want to know what intermediate outcomes were produced. The attractor way of looking at it does focus attention on which of those intermediate outcomes were manifest, and in what order. It may not be possible to assure the funders that the same outcomes and order will appear the next time, but it is possible to give them some pretty good understanding of what happened. The logic of generality and external validity notwithstanding, knowing what happened in one case can be awfully useful for planning the future.