1.0

Updates:
  1. none

The controversy of studying Psi—social or scientific?

An informal exploration

Personal note: I don't actually have a stake in how this particular debate comes out, but I do have a bias against poor science and "agreement-based" science.

Overview:

When "unusual" results are presented, such as those by Radin and by Bem, the howls of protest are vigorous. Does one listen to the researchers and their supporters, or to their critics? How does one go about digging deeper to tease out what are facts, what are scientifically supported opinions, and what are non-scientific criticisms driven by some other agenda?

General practices to consider:
  (a) perform your own studies to add to the literature if you can, tweaked to answer your own specific questions;
  (b) look at the data, and the valid analyses of it;
  (c) look at the interpretations of the way the data has been analyzed;
  (d) look at the quality of work done by (1) the researchers who are reporting, and (2) the institutions and collaborators with whom they are associated;
  (e) look for repeatability by others;
  (f) adhere strictly to the rules of science, including odds against chance and effect size;
  (g) discount the often harsh criticisms that do not meet the same standards as the original work.

What to watch for in critiquing the critiques: Those who say, "It can't be, therefore it isn't." Those who engage in personal attacks on the researchers. Those who make unsupported or ill-researched claims about what was done. Those who claim that "what could have happened" in the running of an experiment is what did happen, without bothering to check with the original researchers. Watch for unsupported biasing phrases such as "As everyone knows..."

Dean Radin

First, let's look at the publication record of Radin. Here is a selected collection of publications. Next we'll look at a bibliography composed from the notes in two of Radin's books - that is, articles which he has studied, and which he uses to support his work.

Second, let's look at some of the endorsements for Radin's work, which include those by two Nobel Laureates and a national book award.

Criticisms

Now let's follow the path of one of the criticisms, which is unfortunately quite common.

Here is the original (but annotated) I. J. Good book review of Radin's book Entangled Minds, published in Nature. Overall the review is actually quite positive, but it does claim that one of the main techniques Radin uses is flawed. It also makes reference to the quite usual claims of fraud and the file-drawer effect [that is, unpublished studies showing no results, still sitting in file drawers somewhere], both of which are treated extensively in the actual text of Radin's book. Most importantly, regarding the meta-analysis that Radin uses, Good claims Radin's work is to be discounted: where Radin calculates that there would have to be 3,300 file-drawer studies, ignored and unpublished, for every successful psi experimental result that was published, Good puts the actual number at only 15 file-drawer studies, or even as low as 8, and gives his analysis showing why his number is correct.
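For reference, the standard tool behind this kind of file-drawer argument is Rosenthal's "fail-safe N": the number of unpublished null studies that would have to exist to drag the combined published result below significance. Here is a minimal sketch of the calculation; the ten studies with z = 2.0 are purely hypothetical, not Radin's or Good's actual inputs:

```python
def failsafe_n(z_scores, z_crit=1.645):
    """Rosenthal's fail-safe N: how many unpublished studies averaging
    z = 0 would be needed to pull the Stouffer-combined z of the
    published studies below the one-tailed .05 threshold (z_crit)."""
    k = len(z_scores)
    stouffer_sum = sum(z_scores)  # numerator of Stouffer's combined z
    return (stouffer_sum ** 2) / (z_crit ** 2) - k

# Hypothetical example: 10 published studies, each with z = 2.0
print(failsafe_n([2.0] * 10))  # ~138 hidden null studies needed
```

The dispute between Good and Radin is, in effect, over what numbers go into a calculation of this general kind and how it is interpreted, which is why the gap between "3,300" and "8" mattered so much.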

Good closes by saying he would believe in psi if someone were to post cricket scores on the web ahead of time. This is odd, because no serious researcher has ever made a scientific claim of such an ability, or anything like it, and it has nothing to do with the work of Radin.

Good's review has been widely used to discredit Radin and his work, and still appears on numerous "Skeptic" websites to this day. Few if any of these websites point out that Good says, for example, that Radin's book "...provides a good summary of the arguments supporting the existence of ESP, with about 600 references." [That is, six HUNDRED scientific studies!] Nor do the skeptical websites point out that Radin makes dozens of strong, well-supported claims in the book and that Good has found fault with only one. They tend to say only that Good has shown Radin to be a poor scientist who does not understand the math. I myself have been told (based entirely on the Nature criticism) that it is "...very disappointing that a college professor would push pseudoscience onto his students."

But there is more to the story, which can be read in the trailing link below. A summary of the contents: In fact, others, in letters to Nature, pointed out a clear error that Good made, showing that Radin is absolutely correct in his analysis. Radin himself wrote a gracious letter to Nature with a clear explanation of Good's error. For some strange reason, Nature acknowledged that Radin was correct but refused to print a correction, suggesting that the review was positive anyway. Nature did finally print the correction, but only much later, and apparently only after a change of leadership at the publication. Here are some notes on the Rebuttal.

Is this the end of the story? Well, no. There are often rebuttals to the rebuttals, and so on, producing an ongoing dialogue that is good for science. But the disturbing tone of the rhetoric, and the routine reporting of unsupported criticisms whenever unusual results are presented, is not good for science. If our best researchers are driven away from research that produces controversial results, who is left to do the work?

Here is another criticism of one of Radin's books: DeBakesy. This is from a pro-skeptical website and as such is looking for evidence in support of skepticism. I don't see a Radin response; at some point Radin will have stopped bothering. Among other things, this review raises a valid criticism, if true and complete (which, on deeper investigation, these rarely seem to be): Honorton and Ferrari removed outliers from their data, while Radin, in reporting their work, did not do so and did not mention this. In the end, while this drops the effect size from 0.02 to 0.012, the results are still significant and interesting, so the argument is a gripe, but not actually a refutation of the data. That is, Radin is attacked for being irresponsible in his reporting, but not otherwise incorrect. For meta-analysis, it seems that leaving out the outliers would attract just as much criticism.
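The arithmetic of this dispute is easy to see in miniature. A fixed-effect meta-analysis pools study effect sizes as a weighted mean, so a couple of large outliers can inflate the pooled estimate without making the remainder vanish when they are removed. The study values and weights below are invented purely for illustration; they are not the Honorton and Ferrari data:

```python
def pooled_effect(effects, weights):
    """Fixed-effect pooled estimate: weighted mean of study effect
    sizes (weights are typically inverse variances)."""
    return sum(e * w for e, w in zip(effects, weights)) / sum(weights)

# Hypothetical studies: four small positive effects plus two large,
# low-precision outliers
effects = [0.01, 0.015, 0.02, 0.012, 0.30, 0.25]
weights = [100, 120, 90, 110, 10, 8]

with_outliers = pooled_effect(effects, weights)
without_outliers = pooled_effect(effects[:4], weights[:4])
print(with_outliers, without_outliers)  # pooled effect shrinks but stays positive
```

As in the case the review complains about, removing the outliers shrinks the pooled effect, but it does not drive it to zero, which is why the criticism reads as a reporting complaint rather than a refutation.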

Another puzzling criticism of one of the studies seems to miss the whole point of the experimental design of the 653 Princeton University free-response studies. In these, "senders" view a selection of original images, or real-time viewed locations, and "send" them to subjects, who then describe them; coders are later asked to associate the subjects' free-response descriptions with the randomly chosen images. That the free-response text passages are not a perfect match with the original pictures is completely irrelevant: the coders simply have to do their best to match what was said with the original images, which they are able to do at staggering odds against chance. So, to my mind, this reviewer is goofy.

Other criticisms are raised, such as missing error bars, about which I have no comment one way or the other: if correct, good job catching this; if not, shame for not asking Radin to explain before publishing. Valid general comments are made regarding researcher bias, which is a well-known effect, though this criticism applies to virtually all data-collecting research.


Daryl J. Bem

(This is still a work in progress. Here are some quick links in the meantime.)

Bem's Vita.

Bem (a highly respected, Ivy League psychology researcher) published a report on nine studies (~400 subjects), eight of which showed evidence of the existence of psi in various ways. He used an unusual method of "reversing" existing, accepted experimental-psychology methodologies, so that his design could not be refuted. His results were completely consistent (odds against chance, effect size, etc.) with other psychological studies. Of particular interest to a reader may be the details of the arguments with Alcock, showing the lengths to which Bem went to be correct in his design and reporting.

Ref: "Feeling the Future: Experimental Evidence for Anomalous Retroactive Influences on Cognition and Affect," Journal of Personality and Social Psychology (JPSP).

One "reversed" design that Bem used is as follows: "...In Bem's back-to-front version, participants were shown a list of words and then asked to recall words from it. Later Bem showed them words randomly selected from the same list, and it turned out that they had been better at recalling these words in the prior test. The subsequent display seemed to have influenced their earlier memory." (from Aldhous (below)).

Criticisms

One of the repeating themes is a variation of "it can't be, so it isn't." In this particular variation, the critics call for a certain kind of Bayesian analysis using an a priori value intended to demand unusually strong results in support of unusual theoretical claims. The general idea is that if something has been shown to be false before, you set a prior (which is, roughly, factored into the results) that makes it harder to show that the thing now IS true. The general claim is that this is more scientific, and adds rigor to the idea of a "common understanding" of the way things work. The problem is that, absent a complete review of the existing literature and prior work in the area, this value can end up being completely arbitrary, making it difficult or impossible to show results. You'll see this theme in the following articles.
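The sensitivity of this kind of analysis to the chosen prior can be shown concretely. The sketch below compares Bayes factors for the same hypothetical data (527 hits in 1000 binary trials, one-tailed p of roughly .045) under a diffuse prior and a prior concentrated near chance; all numbers are illustrative assumptions, not anyone's actual reanalysis of Bem:

```python
from math import lgamma, log, exp

def log_beta(a, b):
    # log of the Beta function, via log-gamma
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def bf01(hits, n, a):
    """Bayes factor for H0: p = 0.5 versus H1: p ~ Beta(a, a).
    The binomial coefficient is common to both marginal likelihoods
    and cancels, so only the Bernoulli terms are needed."""
    log_m0 = n * log(0.5)                                        # P(data | H0)
    log_m1 = log_beta(hits + a, n - hits + a) - log_beta(a, a)   # P(data | H1)
    return exp(log_m0 - log_m1)

# Same hypothetical data, two different priors on the hit rate
wide = bf01(527, 1000, a=1)      # diffuse prior: belief spread over all of [0, 1]
narrow = bf01(527, 1000, a=500)  # prior concentrated near chance (sd ~ 0.016)
print(wide, narrow)              # diffuse prior favors H0; narrow prior favors H1
```

The diffuse prior wastes prior mass on wildly implausible hit rates, so the same data yield a Bayes factor above 1 (favoring the null), while a prior scaled to small effects yields a factor below 1. This is exactly the arbitrariness complained about above: the verdict can hinge on a prior choice that the data themselves do not dictate.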

Another theme: despite Bem's work not yet being reliably repeated by others (apparently several failures and some successes as of 2015-02-18), a great number of very serious research institutions and research publications take his work seriously enough to print follow-ups.

One of the serious responses to Bem (Wagenmakers, above) appears to be so attached to the idea of "it can't be, so it isn't" that the authors reconcile Bem's results by admitting that his work did indeed follow every rule of psychological research; the only conclusion left, then, is that all psychological results based on such testing and analysis must now be invalidated [which includes much (most?) experimental work of the last century], and that we have to change the way we study psychology.

This might be a valid result, but one that is dramatic.

Also included in the above is an article entitled "Journal rejects studies contradicting precognition," but if you actually read the article, the reporter (Aldhous) also notes that the same journal rejected work that supported precognition: "[editor] Smith defends the decision, noting that he made the same ruling on another paper that, by contrast, supported Bem's findings." And Aldhous reports that both rejections simply reflect the journal's preference for not publishing replication studies, leaving that for one of many hundreds of other journals.

This raises three important points: (a) why is there a bias in the headlines toward the journal having rejected papers with contrary results? (b) why is this essentially non-article, with its inflammatory headline, so highly indexed? and (c) the main critic of the work is calling for a change in the structure of how results are reported in the scientific community. Yet again, a dramatic social result of the original Bem work.

Some relevant links