NLP and Social Theory

Copyright Richard Bolstad 2020

What This Essay Will Not Attempt: Defining NLP

This is not intended to be a comprehensive overview of either NLP or Social Theory. It is intended as a place for us as NLP Practitioners to continue the process of considering NLP within the broader field of social theory, in order to better understand some of NLP’s implicit biases and to deepen some of its insights. This process was perhaps initiated explicitly by Lisa Wake in her book “Neurolinguistic Psychotherapy: A Postmodern Perspective” (2008), which reviewed NLP in the context of “postmodernist” social theory. It is implicit in the book “The Origins of Neuro Linguistic Programming” edited by John Grinder and Frank Pucelik (2012) and much of the early NLP canon. It is explicit in much of Michael Hall’s work.

Robert Dilts in 1976 defined the new field of NLP thus: “Neuro-linguistic programming (NLP) was developed as a means to explore and analyze complex human behavior. It is a cybernetic approach, as opposed to causal, linear, or statistical methods, utilizing the phenomena of language and perception to look for the “differences that make a difference” in the way that human beings organize their experience. Specifically: (a) distinctions people make about their internal experience (i.e., mental images, feelings, internal dialogue, etc.); (b) strategies people use to make sense of their experience; (c) strategies people use to access and communicate stored perceptual information; and (d) how these distinctions and patterns can be integrated to understand, promote, and contribute to the human processes of change, choice, and learning. Being a cybernetic model, the structure of NLP is generative: it has the ability to predict to a certain degree and find alternatives to specific behaviors as well as the ability to describe those already elicited. NLP has also what might be called an “open” structure in that the structure itself may change or expand in accordance with its own findings.”

John Grinder comments on Robert Dilts’ work a few pages after this definition is quoted (2012), accusing him of an “attempt to redefine the field of NLP in a manner that differs significantly from the original intentions of the co-creators of NLP.” I note, however, that Grinder did not disagree with the definition in writing in 1976. I also note that this comment excludes Dilts from the category of “co-creators” of NLP, an exclusion that Dilts might be surprised by. Since I am not attempting to define NLP here, I will move on. In his quote, Dilts defines NLP as an attempt to analyze complex human behavior that is a) cybernetic rather than linear, and b) generative and open rather than closed. I will briefly define those words in other sections below. For now, let us note that they, like Lisa Wake’s term “postmodern”, refer to social theories that were current in the university milieu at the time he wrote.

For Robert Dilts, as for John Grinder, the fact that NLP first emerged in the field of Psychotherapy is merely happenstance. The developers did not so much plan to recreate the psychotherapy of Virginia Satir, Fritz Perls and Milton Erickson, as to pursue far more intellectually divergent paths of their own. Something of the flavor of early NLP experimentation and the serendipity involved in the development of the field is given by John Grinder in his story of how NLP emerged from the first “Meta Model” of Virginia Satir’s questioning style, and his comment that “I was determined to explore these new skill sets and determine how they might well serve the same intentions that had I carried through my Special Forces experiences as well as the radical left-wing politics and political actions that had characterized my public actions during graduate school (University of California at San Diego) and first years as a professor at UCSC. In this quite precise sense, I viewed the Meta Model as a bullshit detector.”

NLP, then, emerged both as a methodology for analyzing individual human achievement and as a method of analyzing wider social interactions. It may not be appropriately included in the context of psychotherapy, but it certainly exists within the context of social theory.

What This Essay Will Not Attempt: Defining Social Theory

Social Theory has been the subject of University level papers which I studied in degree programs in both of the apparently unrelated disciplines of Nursing and Archaeology, which gives you a sense of how wide the accepted applications of these theories currently are. Like NLP, social theories are “meta-disciplines”: models that have uses in a wide variety of other fields, and which sometimes inform those fields without the conscious intention of many who are practicing in them.

In everyday speech, we frequently dismiss theory as “mere” abstraction. Both social theorists and NLP co-developers disagree. Social theorist Hans Joas argues “Theory is as necessary as it is unavoidable. Without it, it would be impossible to learn or to act in consistent fashion; without generalizations and abstractions, the world would exist for us only as a chaotic patchwork of discrete, disconnected experiences and sensory impressions. Of course, in everyday life we do not speak of ‘theories’; we use them with no awareness that we are doing so.” (Joas, H. in Joas, H. and Knöbl, W. eds “Social Theory” p. 5).

Thomas S. Kuhn’s “The Structure of Scientific Revolutions” (1962) is quoted in both this book on social theory and in John Grinder’s Introduction to “The Origins of Neuro Linguistic Programming” (2012). Kuhn explains that once a new “paradigm” or meta-theory has entered scientific thinking, then scientists find that even the “data” itself changes as a result of the theoretical glasses through which they now view it. Kuhn notes “The very ease and rapidity with which astronomers saw new things when looking at old objects with old instruments may make us wish to say that, after Copernicus, astronomers lived in a different universe.” This is true in social theory as much as in physics.

Social Theories are theories that accept as their premise that human beings are social — they exist in interaction. New social theory paradigms allow us to perceive the world of social relationships in new ways and indeed to discover new data.

NLP, John Grinder suggests, is a paradigm shift such as those referred to by Kuhn. However, NLP did not “fall from the sky” ready formed. It emerged as a result of other shifts in perception that the co-developers and their friends refer to (such as cybernetics). John Grinder points out that the notion of neuro-linguistics itself was at least half a century older than NLP. The name occurs in a preface to Alfred Korzybski’s book “Science and Sanity”, published in the 1930s. And NLP could not, of course, predict the other paradigm shifts that would alter the world of psychotherapy, and social science in general, after its inception (such as postmodernism). In this essay, I plan to look at some of the models that were consciously incorporated into NLP, and a couple of models that emerged after it.

The Presuppositions of NLP: Instead of a Social Theory

Apart from the ad-hoc use of terms such as cybernetics, often without any explanation, there is little evidence of explicit and coherent social theory in early NLP works. The nearest approach is the list of principles that are termed the “Presuppositions of NLP”, usually attributed to Leslie Cameron Bandler, and often criticized by both Richard Bandler and John Grinder. Here is Robert Dilts’ nine-point version of these from “Applications of NLP in Family Therapy and Interpersonal Negotiation” (1980). This is a list of aphorisms distilled from the work of Alfred Korzybski, Gregory Bateson, and to some extent from Virginia Satir’s belief system. The list also implies other social philosophical stances such as Pragmatism, as we shall see. In his ongoing attempt to establish an overview of NLP, Michael Hall (2011) has urged us to read the original material to understand the things the developers of NLP took for granted. This article is also an attempt to find the philosophy behind these “truisms”.

1. The map is not the territory.
2. Mind and body are part of the same system and affect each other.
3. Individual skills are a function of the development and sequencing of representational systems.
4. The meaning of any communication is the response it elicits, regardless of the intent of the communicator.
5. Human beings are capable of one-trial learning.
6. Individuals have all the resources they need to achieve their desired outcomes.
7. Behavior is geared toward adaptation: People make the best choices available to them at any point in time, underlying every behavior is a positive intent.
8. There is no substitute in communications for clean, active, open, sensory channels to know what response you are eliciting at any moment in time.
9. The element in a system that has the most flexibility will be the controlling, or catalytic, element in that system.

NLP as non-Psychology

Another way that NLP approached social theory was to distance itself from other social theories and fields such as mainstream Psychology. John Grinder correctly emphasises that “NLP and psychology are easily distinguished by two variables: the domain of study (average performance in psychology and excellence in performance in NLP) and the methodology applied. Given this distinction as the object of study, it is not difficult to conclude that the tools appropriate for one of these, academic psychology, form a statistical approach in which the aggregation of the performances of the subjects in isolated and typically artificial contexts (experiments) occur and a host of methods for determining the average — from the simple division of the cumulative performance by the total number of subjects yields the average.”

A good example: Psychology research shows that on average humans lean forward when thinking about the future, and backwards when thinking of the past. “On average, test subjects thinking of the past leaned back by about 0.07 inches (1.5 to 2 millimeters), whereas those thinking of the future moved about 0.1 inches (3 mm) in the opposite direction.” (University of Aberdeen in Scotland researcher Lynden Miles). In NLP, of course, we are interested in individual responses and eliciting a specific “time line” for each individual. Native speakers of languages such as New Zealand Maori conceive of the future as behind them and the past in front, so would certainly vary from this “average” result.

Pragmatism (William James)

I have noted above that one of the philosophical theories implicit in early NLP is Pragmatism, a tradition that began in the United States around 1870. Its origins are often attributed to the philosophers Charles Sanders Peirce, William James, and John Dewey. “Pragmatic” means focused on usefulness. Pragmatism considers words and thought as tools and instruments for prediction, problem solving, and action, and rejects the idea that the function of thought is to describe, represent, or mirror reality. Basically, it says that we cannot really know what is true, but we can know what is functional: what works.

William James (1842-1910), who predicted much of what we now call NLP, wrote the first book actually titled Psychology (ironically considering how Grinder distanced NLP from that field). In his book, James notes that the individual use of sensory systems determines our results: “In some individuals the habitual “thought stuff”, if one may so call it, is visual; in others it is auditory, articulatory [to use an NLP term, auditory digital], or motor [kinesthetic, in NLP terms]; in most, perhaps, it is evenly mixed.” (James, 1950, Volume 2, p58). He notes that accessing these systems is accompanied by brief eye movements, and he lists a set of subdomains within each sensory system, identical to what NLP now calls “submodalities”.

Graduating as a medical doctor at the age of 27, James found himself depressed and anguished about the pointlessness of his life, which seemed predetermined and empty. In 1870, he made the philosophical breakthrough that enabled him to pull himself out of his depression. This was the realisation that different beliefs have different consequences. James had been puzzling for some time about whether human beings had a genuine free will, or whether their actions were the deterministic results of genetic and environmental influences. He now realised that such questions were insoluble, and that the more important issue was which beliefs have the most useful consequences for the believer. James discovered that the belief in determinism made him passive and impotent; the belief in free will allowed him to consider alternatives, to act and to plan. Describing the brain as “an instrument of possibilities” (Hunt, 1993, p149), he decided, “At any rate, I will assume for the present -until next year- that it is no illusion. My first act of free will shall be to believe in free will. I will go a step further with my will, not only act with it, but believe as well; believe in my individual reality and creative power.” This is the essence of Pragmatism, and — many would say — the essence of NLP.

Cybernetics and Functionalism (Gregory Bateson)

John Grinder’s friendship with and intellectual debt to Gregory Bateson is well acknowledged. Bateson taught “Steps to an Ecology of Mind” as a course at UCSC, at the same time that Grinder taught the rather anti-establishment “Political Economy of the United States” there. It is from here, more than anywhere else, that the understanding of systems theory (cybernetics) entered NLP. Bateson’s interest in systems theory forms a thread running through his work. He was one of the original members of the core group of the Macy conferences in Cybernetics (1941-1960), and the later set on Group Processes (1954-1960), where he represented the social and behavioral sciences. He was interested in the relationship of these fields to epistemology (the study of how we know what is real).

So what is systems theory and what is cybernetics? Systems theory first emerged from the work of Ludwig von Bertalanffy in the 1930s and was named “Cybernetics” by Norbert Wiener in 1948 (the term comes from the Greek word for a steersman on a boat and refers to the way a steersman adjusts his/her actions continuously as a response to feedback). Gregory Bateson used the analogy of a man cutting down a tree with an axe (Bateson, 1991, p 164). He points out that in order to swing the axe, the man needs to pay attention to where the last cut was. The cut, it could be said, causes him to swing in a certain place. And each cut could also be said to result from the specific properties of the axe: how heavy it feels, and how well balanced. So the axe, it might be claimed, controls the cut, which controls the man. Actually, of course, Bateson is claiming that the tree-cutting is a system, and that cause and effect descriptions of it may be useful for communication, but have little use scientifically. Even the arbitrary divisions between man, axe and tree merely simplify in ways which suit our communication style.

In his book Steps to an Ecology of Mind, Bateson applied cybernetics to the field of ecological anthropology and the concept of homeostasis. He saw the world as a series of systems containing those of individuals, societies and ecosystems. Each of these systems has internal adjustment and adaption systems which depend upon feedback loops to create balance or “homeostasis”. Bateson saw these natural ecological systems as innately “good”. This implication of systems theory is called functionalism.
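
The feedback loops described above can be illustrated with a minimal sketch in Python. It is not drawn from Bateson; the function name, the numbers and the idea of correcting by a fixed fraction of the error are all assumptions chosen for illustration. The system repeatedly compares its current value with a set point and corrects part of the difference, so that it returns toward balance even after an outside disturbance.

```python
def regulate(value, set_point, gain=0.5, steps=20, shocks=None):
    """Simulate a simple negative feedback loop (illustrative assumption).

    At each step the system measures its error (set_point - value) and
    corrects a fraction `gain` of it. `shocks` maps a step number to an
    external disturbance applied just before that step's correction.
    """
    shocks = shocks or {}
    history = [value]
    for step in range(steps):
        value += shocks.get(step, 0.0)   # outside disturbance, if any
        error = set_point - value        # feedback: how far from balance?
        value += gain * error            # corrective adjustment
        history.append(value)
    return history


# Hypothetical numbers: start at 30, balance point 37, a shock of -4 at step 5.
history = regulate(30.0, 37.0, gain=0.5, steps=20, shocks={5: -4.0})
print(round(history[-1], 3))
```

Note that the loop has no single "cause": the value shapes the error, and the error shapes the next value, which is the circularity that Bateson's axe analogy points to.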

Functionalism, in social sciences, is a theory based on the premise that all aspects of a society–institutions, roles, norms, etc.–serve a purpose or “function” and that all are indispensable for the long-term survival of the society. The society cannot exist when they are removed arbitrarily, and they cannot exist in the same way outside the system. This social theory first gained prominence in the works of 19th-century sociologists, particularly those who viewed societies as organisms (as indeed Bateson also did). The French sociologist Émile Durkheim argued that it was necessary to understand the “needs” of the social organism to which social phenomena correspond. Function can refer to the interrelationships of parts within a system, the adaptive intention of a phenomenon, or its observable consequences. The function of a baby crying, then, can be understood as an intervention by part of the family system to get energy focused on it by another part (the caregiver), as an adaption by which the family ultimately survives, or as an action which has the consequence of ensuring that the baby is appropriately fed. If the baby’s crying is the most flexible element in the family system (for example it can occur at any time of the day and night, and cannot be delayed or ignored), then it becomes (by definition) the element that “controls” the functioning of that system (as any parent discovers).

This all makes sense of several of the “Presuppositions of NLP”. However, functionalism also implies in some sense that homeostasis is intrinsically good. It ignores the reality that family systems change over time in ways that are more or less useful to their members, precisely because homeostasis can be disrupted. It implies that the aim of psychotherapy in a family system is to help members adjust, and the aim of social analysis is to help individuals adjust to social “realities”. Nazi Germany was a system, and to disrupt its persecution of the Jews would have disturbed its homeostasis in many ways. Rather than anguish about how to support the positive intention of the Holocaust, however, most of us can accept that stopping it is more important, and the homeostasis of the Nazi system can “go to hell”. If the system collapses, so be it. This is why postmodern theory (see below) has drastically modified classical systems theory.

Transformational Grammar (Noam Chomsky)

Since John Grinder was a professor of Linguistics, a field arguably created by Noam Chomsky, Chomsky’s work was almost as significant as Bateson’s in early NLP. Richard Bandler and John Grinder used it to analyse the linguistic patterns of Virginia Satir and Milton Erickson. Noam Chomsky’s 1965 book “Aspects of the Theory of Syntax” developed the idea that each sentence in a language has two levels of representation: a deep structure and a surface structure. The surface structure is what the person says and the deep structure is the more complete linguistic understanding that they hold in their unconscious mind. The surface structure is generated (when the person wants to speak) by a series of transformations from the deep structure. The sentences “The bear chased the lion” and “The lion was chased by the bear” thus have the same deep structure, where lion is object, and bear is subject, and the action is chasing. Richard Bandler and John Grinder held that their meta model of Virginia Satir’s questioning enabled a psychotherapist to recover the deeper structure behind a client’s surface comments, helping the client contact this inner experience more fully and change their communication with both themselves and others as a result.

Chomsky himself did not claim that his “transformational grammar” was a model for the processes through which the human mind constructs and understands sentences; merely that it suggested what core knowledge underlies the human ability to speak and understand (verbs, subjects, objects etc). He held that because most of that knowledge is innate, a baby can have a large body of prior knowledge about the structure of language in general and so need to learn only the idiosyncratic features of the language to which it is exposed. As we shall see below, John Grinder found that George Miller’s neuroscientific research did indeed suggest that the “transformational processes” identified by Chomsky may be close to the actual structure of language areas in the brain. Chomsky, meanwhile, discarded the whole idea in the 1990s holding that it was an unnecessary abstraction.

Nonetheless, Chomsky’s insights into deep structure enabled him to more effectively challenge others’ thinking in his own political activism (a field in which Chomsky is as famous as he is in linguistics). Consider this example of “Metamodel” questioning in political discussion:
“Man: Dr. Chomsky, I just want to ask a question on this topic: Daniel Ortega [Nicaraguan President, Sandinista Party] was in power for how long, a decade?
Yes
Man: And yet he lost the election.
Why “And yet”?
Man: Well, he had control of that country for ten years.
What does that mean, “He had control of it”?
Man: He controlled the press.
He did not. In fact, Nicaragua is the only country I know of in history that allowed a major opposition press [La Prensa] to operate while it was being attacked — a press which was calling for the overthrow of the government by violence….” (Chomsky, 2002, p 109)

“I viewed the Meta Model as a bullshit detector…. The initial avowed goal was the extraction of patterning from the geniuses of late 20th century agents of change in order to create choice for individuals and small groups seeking liberation from their self-imposed limitations. But the track I was running behind this activity was a challenge to the processes by which ideologies were promoted in the world and the congruity (or lack thereof) of the people promoting these ideologies.” (John Grinder in Grinder and Pucelik, 2012, p.185) This bullshit detection system (as Grinder called it) provides an important counterbalance to Bateson’s functionalist cybernetics.

General Semantics (Alfred Korzybski)

General Semantics is a self-improvement and even (some would say) a therapy program begun in the 1920s. Polish-American originator Alfred Korzybski (1879–1950) coined the term General Semantics in 1933 with the publication of “Science and Sanity: An Introduction to Non-Aristotelian Systems and General Semantics”, and three years later he used the term Neuro-Linguistic to refer to the same set of distinctions.

Korzybski suggested that our language shapes our experience of the world in ways that are not always helpful. He said that our internal “maps” or theories about the world are never the same as the territory they describe. He takes the example (Korzybski, 1994, p 750) of a map including the cities of Paris, Dresden and Warsaw. He notes that in a useful map, Dresden is given as between Paris and Warsaw, which parallels the relationship that occurs when you drive from one place to the other across the actual territory. At best, a map can contain similar relationships to the territory. Korzybski (1994, p 750-752) points out that words share some characteristics with maps. Words are never equivalent to the actual things or events in the world. The actual events and “things” in the world are part of the territory, and cannot be said to be the same as some map. For example, when I say “This copy of the book Transforming NLP Training is an interesting collection of essays”, the words “This copy of the book Transforming NLP Training” refer to an actual thing, and actual things are not “speakable” (i.e. they cannot be put into words). An actual thing cannot be the same as a map, such as “an interesting collection of essays”. For that reason, he says “The use of the “is” of identity, as applied to objective, un-speakable levels, appears invariably structurally false to facts and must be entirely abandoned. Whatever we might say a happening “is”, it is not.”

One thing Korzybski would never have agreed with is the idea that all maps are equally valid, or equally useful. He is pointing out that a map can be assessed by its usefulness. Useful maps, he says, reveal their closer correlation to the actual territory by the way that using them enables us to act successfully in that territory. Similarly, NLP was never designed to simply accept all maps as equally valid: it was designed to identify which maps are most useful for achieving particular outcomes.

Korzybski himself states it much more clearly. He says (1994, p. 750) “If we consider an actual territory (a) say, Paris, Dresden, Warsaw, and build up a map (b) in which the order of these cities would be represented as Dresden, Paris, Warsaw; to travel by such a map would be misguiding, wasteful of effort …. In case of emergencies, it might be seriously harmful …. We could say that such a map was ‘not true’ … or that the map had a structure not similar to the territory …. We should notice that: A) A map may have a structure similar or dissimilar to the structure of the territory. B) Two similar structures have similar ‘logical’ characteristics …. C) A map is not the territory.” Please note that Korzybski is not claiming that all maps are equally false! The opposite is true. Some maps are more accurately related to the territory than others. This fundamental source of the core NLP presupposition (The map is not the territory) is often completely misunderstood in NLP literature.

Korzybski was very aware of the political implications of his proposal that words shape our thinking, and he devoted considerable energy to trying to educate the leaders of the western democracies about the rising dangers of Fascism. Let’s take an example statement from Adolf Hitler’s Mein Kampf (Hitler, 1943, p 655).

“Our task, the mission of the National Socialist movement, is to bring our own people to such political insight that they will not see their goal for the future in the breath-taking sensation of a new Alexander’s conquest, but in the industrious work of the German plow, to which the sword need only give soil. It goes without saying that the Jews announce the sharpest resistance to such a policy. Better than anyone else they sense the significance of this action for their own future.”

This statement, first written in 1925, seems to reassure people that Hitler does not intend to create an empire; merely to gain enough land for German farmers to plow. This is clearly metaphorical, since twentieth century Germany is not a Neolithic agrarian economy. The people who are afraid of this action are the Jews, Hitler says, because they will lose most from it. Hitler presupposes that he knows what “all” Jews are thinking (he says they “sense the significance…”) and even that he knows how everyone else is thinking (he says the Jews know this “better than anyone else”). He is so sure of this presupposition that he says “It goes without saying…”. He also makes ambiguous use of several “abstract nouns” (a category that Korzybski calls nominalizations because they turn actions into nouns by “naming” or nominalizing them): words such as “political insight”, “the sword”, “soil”, “their future”. In fact, this statement is almost entirely a collage of such words, impossible to interpret with any certainty in concrete terms. Korzybski’s analysis of such statements was the source of two of Grinder’s most powerful linguistic categories: “presuppositions” and “nominalizations”.

Information Processing Theory (Miller, Galanter & Pribram)

George A. Miller was a neuroscientist interested in modelling how the brain makes decisions to act, and John Grinder worked at his laboratory for a year in 1969-1970. He provided two theoretical ideas that are fundamental to cognitive psychology and the information processing framework used in both computer science and in biology. Both ideas were adopted into NLP thinking. The first idea is “chunking” and the capacity of short term memory. Miller (1956) presented the idea that short-term memory could only hold 5-9 chunks of information (seven plus or minus two) where a chunk is “any meaningful unit”. A chunk could refer to digits, words, chess positions, or people’s faces. The concept of chunking and the limited capacity of short term memory became a basic element of all subsequent theories of memory, although Miller’s claim of 7±2 as the magic number in all cases has been greatly disputed by subsequent research.
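
As a rough illustration of chunking, the following Python sketch regroups the same digits into larger units. The digit string, the function names and the use of nine as a cut-off are assumptions made for the example; as noted above, the 7±2 figure itself is disputed.

```python
def chunk(sequence, size):
    """Split a string into consecutive chunks of `size` characters."""
    return [sequence[i:i + size] for i in range(0, len(sequence), size)]


def within_span(chunks, upper_limit=9):
    """Check a list of chunks against an assumed upper limit (7 + 2)."""
    return len(chunks) <= upper_limit


# Hypothetical example: twelve digits, read singly or as three four-digit units.
digits = "149217761945"
as_single_digits = chunk(digits, 1)   # 12 separate items
as_groups = chunk(digits, 4)          # ['1492', '1776', '1945']

print(len(as_single_digits), within_span(as_single_digits))
print(len(as_groups), within_span(as_groups))
```

The information is identical in both cases; only the number of "meaningful units" the person has to hold changes.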

The second concept Miller developed was even more central to early NLP. This was the TOTE (Test-Operate-Test-Exit), first explained by Miller, Galanter & Pribram (1960). Miller suggested that the TOTE should replace the stimulus-response (proposed by psychologists such as Ivan Pavlov) as the basic unit of behavior in psychology. The TOTE then, like stimulus-response, is a way of explaining how any animal is prompted to take actions. In a TOTE unit, a goal is tested to see if it has been achieved, and if not, an operation is performed to achieve the goal; this cycle of test-operate is repeated until the goal is eventually achieved or abandoned. Miller’s proposal is that all our actions can be understood as the result of such goal testing and operation sequences. The TOTE concept provided the basis of many subsequent theories of problem solving (e.g., GPS) and production systems. It is the central “computing” metaphor used in NLP. Although Grinder later says it adds little of value to NLP, Robert Dilts, Judith DeLozier and Bandler and Grinder made it central to their book Neuro Linguistic Programming Volume 1 (the first attempt to write a definitive overview of the field). On pages 1-2 of that book they say “The name neurolinguistics programming stands for what we maintain to be the basic process used by all humans to encode, transfer, guide and modify behavior. For us behavior is programmed by combining and sequencing neural system representations — sights, sounds, feelings, smells and tastes — whether that behavior involves making a decision, throwing a football, smiling at a member of the opposite sex, visualizing the spelling of a word or teaching physics. A given input stimulus is processed through a sequence of internal representations, and a specific behavioral outcome is generated.” This frames all human actions as TOTEs, exactly as Miller intended.

The special contribution of NLP to cognitive science would thus be the combination of the TOTE model with an understanding of the sensory systems.
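
The test-operate cycle described above can be written out as a short loop. The following Python sketch is only an illustration: the function names, the nail example and its numbers, and the use of a maximum cycle count to stand for "abandoning" the goal are all assumptions of mine, not part of the original model.

```python
def run_tote(state, test, operate, max_cycles=100):
    """Run one TOTE unit: Test, Operate, Test, ... Exit.

    Returns (final_state, operations_performed, goal_achieved).
    `max_cycles` is an assumed stand-in for abandoning the goal.
    """
    operations = 0
    while not test(state):                # Test: has the goal been achieved?
        if operations >= max_cycles:      # give up: goal abandoned
            return state, operations, False
        state = operate(state)            # Operate: act to move toward the goal
        operations += 1
    return state, operations, True        # Exit: the test has passed


# Hypothetical example: a nail standing 5 mm proud; each blow drives it 2 mm.
def nail_is_flush(height_mm):
    return height_mm <= 0


def strike(height_mm):
    return max(0, height_mm - 2)


final_height, blows, achieved = run_tote(5, nail_is_flush, strike)
print(final_height, blows, achieved)
```

In NLP terms, the test and the operation would each be carried out in some sensory representation (for example, comparing an internal image with what is seen), which is where the combination with the sensory systems comes in.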

John Grinder also explains how Miller’s work linked into Chomsky’s transformational grammar, which was considered the theory behind his modelling of Virginia Satir’s Meta Model questions. “While the original transformational model developed by Chomsky in 1957 had been replaced/advanced by the time I got to Rockefeller, I noted that Miller was the only researcher (to the best of my knowledge) who had succeeded in operationalizing any of the key concepts in Chomsky’s work for actual detectable consequences. Chomsky had never claimed a “psychological reality” for the abstract formal work in syntax that he had pioneered. Miller blew right past this caveat and succeeded in finding, for example, longer processing times for sentences that differed only in the number of postulated transformations connecting Deep Structure with Surface Structure.”

Countercultural “Shamanism” (Carlos Castaneda)

The reductionism of the TOTE (claiming that all human experience is able to be reduced to sequences of internal sensory representations) is either taken to its extreme or profoundly counterbalanced by the association of the NLP developers with the theories of Carlos Castaneda. Starting with “The Teachings of Don Juan” in 1968, Castaneda wrote a series of books that describe his supposed training in shamanism, particularly with a group whose lineage descended from the Toltecs. The books, narrated in the first person, relate his experiences under the tutelage of a man that Castaneda claimed was a Yaqui Indian “Man of Knowledge” named don Juan Matus. His 12 books have since been found to be fiction, and Castaneda’s doctorate at UCLA was withdrawn as a result, but supporters claim the books are either “true” in some deeper sense, or at least valuable works of philosophy (Chavers, 2011). Native American writers understandably find his work highly offensive.

Robert Dilts explains: “Another significant influence on Bandler, Grinder, and the members of the “Meta” group at that time were the writings of Carlos Castaneda. Castaneda’s works provided explicit descriptions of different states of consciousness, and outlined specific steps to achieve perceptual flexibility and explore the relationship between conscious and unconscious processes. They relate his experience of the visionary reality of the Native Americans guided by the character of Don Juan, a Yaqui Indian who introduced Castaneda to different states of consciousness by means of hallucinogens. For instance, some of the interactions between Don Juan and another Yaqui “sorcerer,” Don Genaro, inspired Bandler and Grinder to create the NLP “double induction” process.”

Like their interest in revolutionary left wing politics, and their equalitarian living arrangements which blurred any professional boundary between students and professors at the University of California, the interest of the NLP developers in the effects of hallucinogenic drugs can be seen as a part of the late 1960s cultural milieu in which NLP evolved. It is nonetheless a significant part of the theoretical mix that created NLP. Castaneda falsified his experiences and thus misappropriated the cultural experiences of Native Americans, claiming to be an expert in a field and a culture of which he had only minimal experience. This presages later NLP shamanic fusions such as Tad James’s development of Huna, now known to be a pseudo-Hawaiian system. Milton Erickson had considerable contact with the famous writer and psychedelic researcher Aldous Huxley, of course, and together they discussed Huxley’s hallucinogenic experiences, which form the basis of his book “The Doors of Perception” (1954). Psychedelic experience is increasingly acknowledged as an important opportunity for us to learn about consciousness itself, and I am certainly not dismissing the practical benefits of it in psychiatry and wider human experience. I am merely urging that we consider where we get our theories about what it means.

Modelling Carlos Castaneda’s work is a project analogous to Robert Dilts’ modelling of the fictional character Sherlock Holmes. We can indeed learn much from it as long as we are clear that the original model does not have the prestige of being tested in the real world. Since the model itself draws into question the existence of that “real world” it may seem a moot point, but the original project of NLP was stated to be studying the distinctions that distinguish excellent performers from average performers, not excellent fantasists from average performers. Castaneda’s work is based on a deliberate deception, and NLP’s admiration for it brings our work into question for similar reasons. Sadly, the pragmatism that seems so liberating has here met its ultimate absurdity: truth does not matter; it is merely a metaphor. Where is Grinder’s metamodel “bullshit detector” when we need it?

Inductive Learning

If the contact of the NLP developers with hallucinogenic drugs was based on a pragmatist experimentation, the same can be said of their entire teaching/learning model. John Grinder says “Eicher offers a detailed (and remarkably accurate) description of the substance and style with which Bandler and I conducted seminars. I invite trainers purporting to train NLP patterning to decode this description with an eye to the design of training — in particular, the inductive approach to learning.” Inductive learning, also known as discovery learning, is a process where the learner discovers rules by observing examples. This is different from deductive learning, where students are given rules that they then need to apply. In fact, the flippancy with which Richard Bandler (apparently untrained and certainly professionally uncertified) ran “Gestalt Therapy” encounter groups in “the early days” seems to have been not so much designed to generate inductive learning from observing examples of the desired result, but from observing random experiments. Frequently it is hard to see what “rules” were being discovered through this process.

In Die Philosophie des Als Ob, the philosopher Hans Vaihinger argued that human beings can never really know the underlying reality of the world, and that as a result we construct systems of thought and then assume that these match reality: we behave “as if” the world matches our models. This seems rather well aligned with postmodern thinking (discussed below), but its expression in early NLP seems something quite different. John Grinder uses this philosophical explanation to rationalize the experimental approach in the early NLP groups (Grinder and Pucelik eds, 2012): “The three of us confronted the world about us in a manner and with a style perfectly consistent with the philosophy of Hans Vaihinger. Bandler introduced Vaihinger’s work into the mix and while it played little part in shifting our actions at the time, it was a fine explication of what each of the three of us had practiced unconsciously and intuitively for years. It became an explicit platform for continuing in our pursuit of patterning that would ultimately become NLP. It was an open ticket to ride and we continued to do so, delighted to explore as living embodiments of Vaihinger’s bold philosophy.” I may be paraphrasing unfairly, but to me it does sound as if the idea was often that NLP developers would pretend to know what they were doing, and pretend to understand the signals they were observing, when they did not.

Stephen Gilligan calls the people involved in NLP “merry pranksters” and David Wick calls them “wild and crazy people” (in Grinder and Pucelik eds, 2012). I guess I was involved with very similar people in the University counter-culture in New Zealand at the same time. Breaking out of the straitjacket of traditional education required innovation, and I also “trained” in running Encounter Groups (see below) with the Leadership Skills Laboratories in New Zealand at this time. I have since trained as a teacher in the New Zealand College of Education system, and I don’t think what happened in the early NLP community really fits the model of inductive learning as well as John Grinder implies.

I quote from Michael Prince and Richard Felder (2006), whose article “Inductive Teaching and Learning Methods: Definitions, Comparisons, and Research Bases” gives a far more realistic view of this field: “In practice, neither teaching nor learning is ever purely inductive or deductive. Like the scientific method, learning invariably involves movement in both directions, with the student using new observations to infer rules and theories (induction) and then testing the theories by using them to deduce consequences and applications that can be verified experimentally (deduction). Good teaching helps students learn to do both. When we speak of inductive methods, we therefore do not mean total avoidance of lecturing and complete reliance on self-discovery, but simply teaching in which induction precedes deduction. Except in the most extreme forms of discovery learning (which we do not advocate for undergraduate instruction), the instructor still has important roles to play in facilitating learning–guiding, encouraging, clarifying, mediating, and sometimes even lecturing. We agree with Bransford: “There are times, usually after people have first grappled with issues on their own, that ‘teaching by telling’ can work extremely well.””

Prince and Felder’s model does indeed place Induction in the first half of the 4MAT system (in effective education, answering “Why” we need a theory through real experiences precedes presenting the theory). Bernice McCarthy’s 4MAT system (McCarthy, 1987) suggests that effective teaching moves through a cycle of four stages with each topic. These stages are represented by different learner questions: Why?, What?, How?, and What if? McCarthy suggests that it’s useful first to answer the question “Why are we learning this?” Many students will not be able to engage with the learning process until they feel they have a reason, or a motive for learning. Once this question is answered, the next step is to give the information about the topic, answering questions about “What?”. Only then, when the core information has been imparted, can participants usefully move on to experiment with the subject, doing an exercise or observing a demonstration. Such interaction answers the question “How [do we use this]?” Once they have a practical experience, then participants can successfully explore questions which consider “What would happen if?”, for example “What would happen if this didn’t go the way we expected?” or “What would happen if I use this in a different setting?”

This kind of learning cycle is well understood in educational theory and builds on the previous work of David Kolb. The early NLP groups are not so much examples of this 4MAT model as they are examples of “T-Group”, “Encounter Group” or “Sensitivity Group” learning experiences. These experiential “learning” processes, used by Gestalt therapists and others in the 1960s especially, were a polar reaction to the lecture format. A T-group meeting does not have an explicit agenda, structure, or expressed goal. The concept of encounter was first articulated by J.L. Moreno in Vienna in 1914–15, in his “Einladung zu einer Begegnung” (“Invitation to an Encounter”), maturing into his psychodrama therapy model. The risk of T-groups, pointed out by researchers, is that without any clear goal or even established protocols, dominant personalities can evoke often personally disturbing experiences without taking any responsibility for the results.

Terry McClendon elegantly discussed the real risks of this approach in his book “The Wild Days” (1989). At Christmas of 1974 Richard Bandler announced an annual gift-giving to the NLP group. “Richard was in the middle of the room and he would say, ‘Who would like to have their gifts first?’ Devra would always like to get hers first. She suggested she get her present to begin with and was very happy and excited.” McClendon recounts how Devra was sent from the room and 10 people, including him, were chosen to serve as actors and actresses.

“We were all given white sheets to wear and candles to hold and we walked out to the side deck of the house and were we surprised!” writes McClendon. “Because on that deck was an 8-foot-tall cross like the one that Jesus was crucified on.” “Devra was led out blindfolded and was then tied onto the cross,” which was then set ablaze at the bottom with lighter fluid. “Devra at the time began to smell smoke and was wondering what was going on,” writes McClendon. “Richard asked her if she would like to have her gift now. She said she would and so Richard took her blindfold off and gave her a knife which she could then use to cut herself off the cross.” McClendon notes that Devra never forgave Bandler, in spite of his subsequent attempt to explain “how she could learn from the experience.”

When I trained in NLP back in the early 1990s (with Terry first of all), these stories about the encounter group learning were told as humorous NLP metaphors. They are not funny, and they are not some admirable model of how learning “should happen”. They are examples of sloppy educational theory being used to justify personality problems.

Postmodernism and Deconstruction

One of the main social theories developed since the creation of NLP is Postmodernism, a broad movement that developed in the mid- to late 20th century across philosophy, the arts, architecture, and criticism, marking a departure from modernism. Modernism, of course, was never really an official movement, but by 1960, there was a strong feeling in most social sciences that humanity had reached a rational unbiased conclusion to the long history of superstition, and that “modern” thinking was the supreme end of theoretical development. This belief in our time as an evolutionary endpoint is what is being termed “modernism” by postmodernism.

Postmodern thinkers frequently call attention to the contingent or socially-conditioned nature of knowledge claims and value systems, situating them as products of particular political, historical, or cultural discourses and hierarchies. Accordingly, postmodern thought is broadly characterized by tendencies to self-referentiality, epistemological and moral relativism, pluralism, and irreverence. As such it critiques progressive theories such as Social Evolutionism (e.g. Robert Dilts’ Neurological levels, Clare Graves’ Spiral Dynamics model) which claim that predictable evolutionary “upgrades” or “levels of awareness” occur in human history or experience. It also critiques the Structuralist idea that existing social relations (such as cultures and religions) are intrinsically functioning systems which should be accepted rather than critiqued. Jacques Derrida, Michel Foucault, Jean-François Lyotard, Jean Baudrillard, and other 1970s poststructuralists in France are usually considered the core postmodern thinkers.

Noam Chomsky considered postmodernist writing to be “incomprehensible”, and indeed in its original French presentation it does seem hard to make practical use of the philosophy. “Quite regularly, “my eyes glaze over” when I read polysyllabic discourse on the themes of poststructuralism and postmodernism; what I understand is largely truism or error, but that is only a fraction of the total word count,” he says in his article “Rationality/Science”. However, one of the most well-known postmodernist tactics that we can use practically is “deconstruction,” a method of philosophy, literary criticism, and textual analysis developed by Jacques Derrida. Derrida urges us to pay attention to a text’s unacknowledged reliance on metaphors and symbols. His method often involves demonstrating that a given philosophical idea depends on black and white binary oppositions or on deliberately excluding terms that the discourse itself has declared to be irrelevant or inapplicable. The NLP metaprogram categorisations would be examples of binary oppositions, and the “programming” metaphor would be a good example of an assumed metaphorical stance within NLP.

Lisa Wake sees hints of a postmodern approach in the early NLP experimentalism, and says “NLP is not a pattern that can be taught: NLP is learning and discovery.” (2008, pp 161, 174) She adds “If neurolinguistic psychotherapists stay within a programmatic model of working they are not honoring Erickson’s work, and most importantly are staying out of rapport with a client’s unconscious…. By utilizing the principles of systems theory, it is important that clients are encouraged to access their unconscious and view their subjective experience from within the principles of cybernetics.” She quotes the psychoanalyst Allan Schore as warning that “Cycles of organization, disorganization, and reorganization of the intersubjective field occur repeatedly in the treatment process.” That is to say: psychotherapy (as opposed to one-time interventions) is intrinsically postmodern and deconstructs the client’s reality repeatedly. This, I guess, is indeed much closer to the experience that Gestalt therapy offered, but I’m not sure if it is true to Erickson’s model, which can also be viewed as rather more strategic.

Deconstruction might also have saved NLP from some of the theoretical confusion that I discuss above, and certainly could caution us about what postmodernists call “Evolutionism”: the idea that there is a grand plan or “metanarrative” behind history (Johnson, pp 126, 187-196). That in itself would lead us to reject theories such as “Spiral Dynamics” developed by followers of Clare Graves. Graves contended that there was a predictable unfolding of human “values systems” through history: “‘new brain systems may be activated and, when activated, change his perceptions so as to cause him to see new problems of existence.’ Instead of beginning only as passive hardware without content (Locke’s tabula rasa or blank slate view), it turns out the normal human brain comes with potential ‘software’-like systems just waiting to be turned on -latent upgrades!” (Beck and Cowan, 1996, p51). This is an excellent example of what postmodernists are referring to as “modernism”. Don Beck and Chris Cowan, before they fell out with each other, referred to themselves as spiral wizards who stood above this grand scheme of history and could help others evolve through it. This pretence that the “map” and “territory” distinction can be transcended by evolved spiral wizards is still in need of deconstruction through much of NLP. It is, as I have said elsewhere, as naive as the grand plan of Marxism that suggests that just as Slave societies “naturally” evolve into Feudal societies and Feudal societies “naturally” evolve into Capitalist societies, so in the near future Capitalist societies will “naturally” evolve into Socialist societies; or the grand plan of what was once called “Whig” history that suggests that societies are increasingly and relentlessly moving towards freedom and democracy; or the grand plan of Christian theorists in early Rome that the very existence of the Roman empire is part of God’s grand plan to spread Christianity to every part of the world, inevitably ushering in the kingdom of Heaven.

Why Theory Matters

Finally, theory matters because practice matters, and all practice is infused with theory, either consciously or unconsciously. The lack of interest in coherent theory is part of a story of NLP that has seen it dismissed as “pseudoscience”. Wikipedia currently points out “Scientific reviews state that NLP is based on outdated metaphors of how the brain works that are inconsistent with current neurological theory and contain numerous factual errors.” I am as frustrated as any NLP Practitioner with the ongoing failure of Wikipedia to acknowledge the research into NLP based therapeutic processes. However, I agree with much of its critique of NLP as theory.

Consider the above quoted claim by John Grinder that with NLP we are witnessing a new paradigm emerging. Wikipedia notes “The philosopher Robert Todd Carroll responded that Grinder has not understood Kuhn’s text on the history and philosophy of science, The Structure of Scientific Revolutions. Carroll replies: (a) individual scientists never have nor are they ever able to create paradigm shifts volitionally and Kuhn does not suggest otherwise; (b) Kuhn’s text does not contain the idea that being unqualified in a field of science is a prerequisite to producing a result that necessitates a paradigm shift in that field and (c) The Structure of Scientific Revolutions is foremost a work of history and not an instructive text on creating paradigm shifts and such a text is not possible–extraordinary discovery is not a formulaic procedure. Carroll explains that a paradigm shift is not a planned activity, rather it is an outcome of scientific effort within the current (dominant) paradigm that produces data that can’t be adequately accounted for within the current paradigm–hence a paradigm shift, i.e. the adoption of a new paradigm. In developing NLP, Bandler and Grinder were not responding to a paradigmatic crisis in psychology nor did they produce any data that caused a paradigmatic crisis in psychology. There is no sense in which Bandler and Grinder caused or participated in a paradigm shift. “What did Grinder and Bandler do that makes it impossible to continue doing psychology…without accepting their ideas? Nothing,” argues Carroll.”

Much of what passes for theory in NLP is simply the dropping of names (of theorists and theories) such as Gregory Bateson, George Miller, Milton Erickson, Cybernetics and Inductive Learning. This essay urges you to go back to the sources of these terms and people, and find out why Milton Erickson called Bandler and Grinder “Bandit and Swindler”, and why Gregory Bateson referred to their “modelling” of Milton Erickson as “sloppy epistemology”. It is largely because the people the NLP developers studied actually knew the theory behind what they did. At times, quite clearly, Bandler and Grinder did not. I say this with great gratitude to Bandler, Grinder and the other NLP developers, whose work I admire and have based much of my career on. We can admire the story of NLP without idealizing it. I think anyone who actually studies social theories such as cybernetics, inductive learning, shamanism, information processing, linguistics, and general semantics is struck by the almost cartoon-like simplification of these in the NLP field. My hope is that I can create enough cognitive dissonance here for you to begin going back to the social theory literature to study it.

Bateson, G. (1991) A Sacred Unity, New York: HarperCollins
Bateson, G. (2000) [1972]. Steps to an Ecology of Mind: Collected Essays in Anthropology, Psychiatry, Evolution, and Epistemology. Chicago, Illinois: University of Chicago Press. ISBN 0-226-03905-6.
Beck, D.E. and Cowan, C.C. (1996) Spiral Dynamics, Oxford: Blackwell
Castaneda, C. (1998) The Teachings of Don Juan: A Yaqui Way of Knowledge. Berkeley: University of California Press
Chavers (2011) “The Fake Carlos Castaneda” https://newsmaven.io/indiancountrytoday/archive/the-fake-carlos-castaneda-S3M9DbK1OUuu5XH-tIfQtg
Chomsky, N. (1995) “Rationality/Science”, Z Papers Special Issue, online at http://www.zmag.org/chomsky/articles/95-science.html
Chomsky, N. (1965). Aspects of the Theory of Syntax. MIT Press. ISBN 0-262-53007-4.
Chomsky, N. (2002) Understanding Power: The Indispensable Chomsky New York: The New Press
Dilts, R., Grinder, J., Bandler, R. and DeLozier, J. (1980) Neuro-Linguistic Programming: Volume 1 The Study of the Structure of Subjective Experience, Cupertino, California: Meta Publications
Elliot, A. (2005) “Psychoanalytic Social Theory” pp 175-195 in Harrington, A. ed. Modern Social Theory: An Introduction Oxford: Oxford University Press
Grinder, J. and Pucelik, F. eds. (2012) The Origins of Neuro Linguistic Programming, Bancyfelin: Crown House
Hall, M. (2011) “NLP Presuppositions from Korzybski” at https://www.giannicoladeantoniis.com/nlp-presuppositions-from-korzybski-by-michael-hall-neuro-semantics/
Hall, L. M. (1995) The Spirit of NLP, Carmarthen: Anglo American Book Company
Hitler, A. (1943) Mein Kampf Houghton Mifflin, Boston
Holmwood, J. (2005) “Functionalism and its Critics” pp 87-109 in Harrington, A. ed. Modern Social Theory: An Introduction Oxford: Oxford University Press
Hunt, M. (1993) The Story of Psychology, New York: Doubleday
James, W. (1950) The Principles of Psychology (Volume 1 and 2), Dover, New York.
James, W. (1961) The Varieties of Religious Experience, Collier, New York.
Joas, H. and Knöbl, W. (2009) Social Theory: Twenty Introductory Lectures, Cambridge: Cambridge University
Johnson, M. (2020) Archaeological Theory: An Introduction Hoboken, NJ: Wiley Blackwell
Korzybski, A. (1994). Science and Sanity: An Introduction to Non-Aristotelian Systems and General Semantics (5th ed.). Brooklyn, NY: Institute of General Semantics.
Kuhn, T. (1970) The Structure of Scientific Revolutions Chicago, IL: University of Chicago Press
Lash, S. (1990) The sociology of postmodernism London, Routledge
McCarthy, B. (1987) The 4MAT System Barrington, Illinois: Excel
McClendon, T.L. (1989) The Wild Days Cupertino, California: Meta
Miles, L. quoted at https://news.softpedia.com/news/You-Lean-Forward-When-Thinking-of-the-Future-133599.shtml
Miller, G.A. (1956). The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychological Review, 63, 81-97.
Miller, G.A., Galanter, E., & Pribram, K.H. (1960). Plans and the Structure of Behavior. New York: Holt, Rinehart & Winston.
Prince, M., and Felder, R.M. (2006). Inductive teaching and learning methods: Definitions, comparisons, and research bases. Journal of Engineering Education 95 (2): 123–38
Wake, L. (2008) Neurolinguistic Psychotherapy: A Postmodern Perspective, Hove, East Sussex: Routledge