4. EPIDEMIOLOGY AND POWERLINE EMF HEALTH HAZARDS
The EMF epidemiological studies give rise to a fair and honest doubt about the safety of powerlines.
Historically, the methods and procedures of epidemiology have worked well in identifying and characterizing health risks due to infectious agents. Epidemiology has also successfully identified risks due to some non-infectious agents, including the links between smoking and lung cancer and between thalidomide and birth defects.
The first epidemiological study that considered the possible health menace of powerline EMFs was performed by Dr. Becker in the early 1970s. He found an association between environmental EMFs and cancer, and interpreted it to generally support the stressor hypothesis regarding the mechanism of action of EMFs. Subsequently, many hundreds of studies were performed and interpreted to support many different opinions concerning the health menace of powerline EMFs. In this Section, I will describe how the EMF epidemiological studies were performed and evaluated. I will show that the scientific meaning and public-health significance of the EMF epidemiological studies depends entirely on the evaluative criteria utilized to individually and globally assess the studies.
Clinical Study Standards: Randomization
I served on the Institutional Review Board (IRB) of the LSU Medical School for many years, including 5 years as Chairman. During that time, I read several thousand applications that were made to the IRB for permission to conduct human experimentation. Although the purposes of the studies varied, most were clinical studies aimed at determining whether a particular drug or device was effective in treating a particular disease. Typically, the plan of study was approved by the Food and Drug Administration, which stipulated that if the study was performed as proposed, and if the data obtained was as expected, then the existence of a cause-effect relationship between the study drug or device and an improvement in the disease could validly be inferred.
A fundamental aspect of the studies approved by the IRB was the use of randomization of study subjects to treatment or control groups. Statistical methods used to evaluate the data were based on the assumption of randomization, and the conclusion of a cause-effect relationship was based on the statistical evaluation. In contrast to clinical studies, EMF epidemiological studies never used randomization of subjects because a randomized trial to assess whether EMFs affect human health is ethically impermissible.
The lack of randomization in EMF epidemiological studies had serious consequences with regard to what could validly be inferred. For example, suppose that the risk for cancer in an EMF-exposed group was found to be greater than the risk in the control group. The salient question would then be whether the association of increased risk with EMF exposure was a cause-effect relationship, or a mere association such as that between stock-market prices and hemlines. In the absence of randomization, it is impossible to have reasonable assurance that no factor was associated with both EMF exposure and cancer, and that this factor, not the EMF, was the true cause of the disease. If that were the case, then the correlation between EMFs and the disease could not validly be interpreted to indicate a cause-effect relationship. Because there are always many such potential causes, an observed increased risk in an epidemiological study could equally be explained as the result of an uncontrolled factor. Similarly, it is always possible that a finding of no increased risk could be equally explained by a failure to control for a pertinent risk factor. It follows that every EMF epidemiological study is intrinsically inconclusive to some degree. It is a matter of human judgment whether the degree of uncertainty in particular studies or groups of studies is sufficient to warrant a particular conclusion. Reasonable people may differ regarding this judgment.
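The effect of an uncontrolled factor can be illustrated with a small constructed example. The sketch below is only an illustration: the counts are hypothetical, the "unmeasured factor" is not any factor named in the studies, and the helper names are invented for this example. Within each level of the factor the exposed and unexposed groups are given the same risk, so the exposure has no effect at all, yet the factor is more common among the exposed.

```python
from fractions import Fraction

# Hypothetical counts (persons, cases), split by an unmeasured factor.
# Within each stratum the exposed and unexposed risks are set equal,
# so EMF exposure has no causal effect in this constructed population.
strata = {
    "factor present": {"exposed": (800, 16), "unexposed": (200, 4)},
    "factor absent": {"exposed": (200, 1), "unexposed": (800, 4)},
}


def relative_risk(exposed, unexposed):
    """Risk in the exposed group divided by risk in the unexposed group."""
    (n_exp, cases_exp), (n_unexp, cases_unexp) = exposed, unexposed
    return Fraction(cases_exp, n_exp) / Fraction(cases_unexp, n_unexp)


def pooled(group):
    """Total persons and cases for one exposure group, ignoring the factor."""
    persons = sum(stratum[group][0] for stratum in strata.values())
    cases = sum(stratum[group][1] for stratum in strata.values())
    return persons, cases


# Relative risk within each level of the unmeasured factor.
stratum_rr = {
    name: relative_risk(s["exposed"], s["unexposed"]) for name, s in strata.items()
}
# Relative risk an investigator would see if the factor were never recorded.
crude_rr = relative_risk(pooled("exposed"), pooled("unexposed"))

print({name: float(rr) for name, rr in stratum_rr.items()})
print(float(crude_rr))
```

In this constructed population the relative risk is exactly 1 within each stratum, but the pooled relative risk is 17/8, or about 2.1. An investigator who never recorded the factor would see only the pooled figure.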
Other Clinical Study Standards
Aspects of clinical study designs other than randomization also contribute to their reliability. Many of these design features are possible in epidemiological studies, but they have rarely been incorporated into the design of EMF epidemiological studies.
Two of the missing features are particularly important. First, every approved clinical study has an experimental hypothesis. Usually, the hypothesis is that a particular drug, device, or surgical procedure will be more efficacious than a suitable control, and the purpose of the study is to evaluate the hypothesis. The hypothesis is stated before the data is analyzed and is usually based on laboratory results that provide some basis for concluding that the study has merit and is worth the risk of exposing human beings to novel situations. A statistical test closely associated with the experimental hypothesis is used to objectively assess whether or not the experimental hypothesis was supported. In the absence of a hypothesis of some kind, one could have no confidence that statistical associations found in the data after it was collected were causal. They could be, but there is simply no basis for deciding.
Second, in a clinical study the drug is administered only to the patients in one of the two study groups. The second group, the controls, receive the same degree of attention, but they do not receive the study drug, and consequently can serve as the basis for evaluating its effect. Further, the dose of the drug is recorded so that it is possible to identify which patients received the drug, and how much they received. If the investigator could not determine who did and did not receive treatment, how much treatment was received, and whether treatments other than the study treatment were administered, then assessment of the effect of the drug would be impossible.
As discussed below, these routine and fundamental features of clinical studies are absent in EMF epidemiology studies.
Epidemiological studies are traditionally divided into three general groups based on the timing of the identification of the case and control subjects. If the cases (subjects with the disease of interest) are identified prior to the control subjects, then the statistical comparison will involve a determination of whether the cases have a greater risk of exposure, and the approach is a case-control study. If subjects having or not having the exposure are identified first and then followed to determine the incidence of disease, the procedure is a cohort study (a metaphorical reference to a Roman military cohort which always moved forward, never backward). If the cases, controls, and exposures are identified at the same time (such as in the analysis of a list of persons who died from various causes subdivided into occupations), the procedure is a cross-sectional study. In this report, the focus is on epidemiological methodology itself, rather than on the less important issue of the implications of differences in epidemiological designs. Consequently, the studies are discussed without regard to the particular epidemiological design employed.
Absence of Hypotheses in EMF Epidemiological Studies
In a study by Wertheimer and Leeper (WL), the cases were children who died with cancer, and the controls were normal children identified from birth certificates. The relationship between various predetermined classes of powerlines and the birth and death residences of the two groups was determined, and more than the expected number of cancer cases occurred among the subjects who lived near the powerlines.
For reasons never made clear, WL decided that the aspect of powerlines that might be linked with human disease was the magnetic field. Nothing prior to their study reasonably suggested that the magnetic field might be an etiologic agent, and in fact most animal studies had been performed using electric fields. Nevertheless, they chose magnetic fields for study, and constructed a coding system for identifying whether particular powerlines did or did not give rise to magnetic fields at the residences of the study subjects (WL wire codes). Subsequently, evidence of the validity of the WL codes as a surrogate for EMF exposure was provided by measurements showing a relation between field strength and the coding system.
WL never explained why they chose to study cancer in relationship to magnetic fields rather than say diabetes or arthritis or mental retardation. Because there was no study hypothesis, no basis for studying magnetic fields, and no reason to choose cancer as an endpoint, it seems fair to characterize the WL study as the investigation of a subject (potential association of magnetic field exposure coded by a particular visual identification system, with the occurrence of childhood cancer), rather than a scientific study to test a specific hypothesis. They had an obvious interest in and aptitude for their subject, and because they paid for the study themselves, they were not required to justify its design or rationale to anyone.
The cause-effect relationship suggested by the association found by WL has great public-health significance because, despite an unprecedented degree of attention by the power companies who commissioned many similar studies, the apparent correlation discovered by WL has continued to stand up. But the absence of a hypothesis - whether or not justified under the circumstances that prevailed in 1979 - led to numerous subsequent EMF epidemiological studies that also had no hypothesis. The resulting confusion significantly obscured the landmark status of the WL coding system and the public-health implications of their findings.
For example, in their next study they chose a control group that contained dead subjects. Again, they stated no explicit hypothesis but the hypothesis actually tested by the statistics they employed was whether EMF exposure was more likely among people who died from cancer compared with a mixed group of controls, some of whom died from diseases other than cancer. The assumption cannot be made that the controls provided an unbiased estimate of the prevalence of EMF exposure among the general population (which might be a reasonable assumption for a normal control group, as they used in their first study). Thus, the implicit hypotheses in the two WL studies are different, and possibly inconsistent.
In a subsequent study based in Seattle, no significant relation between acute non-lymphocytic leukemia and EMFs was observed. Although the authors used the WL wire codes for identifying EMF exposure, the choice of non-lymphocytic leukemia as an endpoint was arbitrary and unjustified by any prior work. The authors seemed to suggest that there was some relationship between their study and those of WL in the sense that a statistical association in the Seattle study would have strengthened acceptance of a causal association in the WL study. It is difficult to understand why they thought that should be the case. Although WL never recognized it, their choices of all cancer as the endpoint and a normal control group (in their first study) were the ideal design to test the stressor theory of EMF-induced disease. On the other hand, there was no rationale whatever for the investigators in the Seattle study to limit the study to a particular histological sub-type of cancer.
In a study based in Rhode Island, a unique coding system for identifying the presence of magnetic fields was used, and no link with childhood leukemia was found. The authors seemed to say that their study was pertinent to the WL study, though the chosen endpoint was childhood leukemia, not childhood cancer as in the first WL study. The authors of the Rhode Island study were clearly impressed that WL found a statistical association between childhood leukemia and wire codes when they searched through their data. But this association was not a planned comparison by WL, and therefore could not be used to conclude that magnetic fields and childhood leukemia were associated. It is always possible to rummage through data already collected to find unplanned statistical associations. The implicit hypothesis of the Rhode Island study seems, therefore, to have been related to an impermissible inference from the original WL study. It is difficult to be certain, however, because the authors of the Rhode Island study stated no hypothesis.
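The ease of finding unplanned associations can be made concrete with a simple calculation. The sketch below is an illustration under stated assumptions, not a description of any study's analysis: it assumes that every comparison is tested at the conventional 0.05 level, that the comparisons are statistically independent, and that no true association exists. The function name is invented for this example.

```python
def chance_of_spurious_finding(n_comparisons, alpha=0.05):
    """Probability that at least one of n independent comparisons appears
    statistically significant at level alpha when no true association exists."""
    return 1 - (1 - alpha) ** n_comparisons


# How the chance of at least one spurious "finding" grows with the
# number of unplanned comparisons made on the same body of data.
for n in (1, 5, 14, 20):
    print(n, round(chance_of_spurious_finding(n), 3))
```

Under these assumptions, an investigator who makes 14 or more unplanned comparisons is more likely than not to find at least one that appears significant.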
In a Los Angeles study, childhood leukemia was considered in relation to magnetic fields as indexed by the WL codes, 24-hour measurements, and spot measurements. An association with magnetic fields as indexed by the codes was observed, but not as indexed by the other surrogates. Because there was no hypothesis, the study seems best characterized as a historical narrative in which the author described a series of actions that led to various kinds of data, followed by an unplanned pattern of statistical analysis of the data followed by the expression of opinions regarding the meaning of the data.
In a study involving children who lived in Stockholm, Sweden, the cases were subjects who had either benign or malignant tumors, and controls were chosen from birth records. The magnetic field at each residence was measured and a unique system for coding for the presence of EMFs from powerlines and other sources was used to examine for possible statistical associations. As might be expected, some associations were positive and others were not. Thus it is possible to argue inconsistently regarding the implications of this study, based on which statistical associations are given credence. Since none of them were specifically planned, within the context of the study, there is no clearly correct choice.
In another series of studies, dead or diseased subjects were used as controls. Consequently, it is even more difficult to identify a plausible study hypothesis. The results were as follows.
What would be the possible inferences from these studies, even assuming that hypotheses had been stated? If the EMF subjects had a particular type of cancer, say leukemia, and the control subjects had non-leukemia cancer, then the idea actually tested in a statistical analysis would be whether EMF exposure was more likely among leukemia subjects, compared with subjects who died with another form of cancer. But it is hard to make sense of this comparison because the assumption cannot be made that the subjects who developed non-leukemic cancer provided an unbiased estimate of the prevalence of EMF exposure among the non-diseased population. This is particularly true because the only plausible biological hypothesis yet proposed to explain the link between powerlines and human disease, namely the stressor hypothesis, suggests that any diseased control group will contain a higher proportion of EMF-exposed subjects, compared with healthy subjects. Because the estimate of risk in an epidemiological study involves comparisons of risks between the cases and controls, the use of a disease control group can (and probably does) lead to an underestimation of the risk of EMF exposure in the healthy population.
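The direction of this bias can be shown with constructed numbers. In the sketch below all three exposure prevalences are hypothetical and chosen only for illustration. It assumes, as the stressor hypothesis suggests, that EMF exposure is more common among diseased controls than among healthy people.

```python
from fractions import Fraction


def odds(prevalence):
    """Convert a prevalence of exposure into the odds of exposure."""
    return prevalence / (1 - prevalence)


def odds_ratio(exposed_among_cases, exposed_among_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    return odds(exposed_among_cases) / odds(exposed_among_controls)


# Hypothetical prevalences of EMF exposure (assumptions, not study data).
exposed_among_cases = Fraction(25, 100)
exposed_among_healthy = Fraction(10, 100)
exposed_among_diseased_controls = Fraction(18, 100)

or_healthy_controls = odds_ratio(exposed_among_cases, exposed_among_healthy)
or_diseased_controls = odds_ratio(exposed_among_cases, exposed_among_diseased_controls)

print(float(or_healthy_controls))   # comparison against healthy controls
print(float(or_diseased_controls))  # comparison against diseased controls
```

With these assumed prevalences the odds ratio is 3 when the controls are healthy and 41/27, or about 1.5, when the controls are diseased. The same cases yield a smaller estimate of risk solely because of the choice of control group.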
Epidemiological studies that employed a cross-sectional design constitute another group of non-hypothesis EMF epidemiological studies whose theoretically possible hypotheses seem irrelevant if the goal is to reasonably estimate human health risks due to EMFs.
In any plan to assess a hypothetical cause-effect relationship it is necessary to distinguish between those who did or did not receive the EMF exposure, to determine how much EMF exposure was received, and to determine who received other potentially important exposures. None of these goals were achieved in any EMF epidemiological studies. The question whether there were one or more studies where these goals were achieved sufficiently to warrant use of the studies in public-health planning is unresolved because there is nothing even resembling agreement regarding how close is close enough.
In a study in Stockholm, Sweden, for example, the investigators considered distances as great as 150 m to be within the zone of influence of powerline equipment. Not surprisingly, the mean field strength at the residences labeled as exposed was the same as that at the control residences. In an English study, persons who lived within 15 m of a transformer were classified as exposed even though transformer fields do not extend that far. The control subjects in the study were also misclassified: because most English powerlines are underground, not living within 15 m of a transformer is not a good surrogate for non-exposure. In a Rhode Island study, occurrence of EMF exposure was predicated on the basis of mathematical calculations that seem hopelessly uncertain. In another English study, the surrogate for EMF exposure was so bizarre that less than 1% of the study subjects were exposed.
Regrettably, the later epidemiological studies have essentially the same shortcomings in design as the epidemiological studies done 10-20 years ago; consequently the later studies are no more probative. For example, Linet and her colleagues examined the relationship between powerline EMFs and acute lymphocytic leukemia in children, and concluded that the study results "provide little evidence" of a link. But the authors gave no hint of what they meant by "little" or whether the evidence, even though it was "little", was enough to, for example, warrant mandatory rules or governmental warnings about whether families with small children should live beside powerlines. Further, the Linet study had no hypothesis, and consequently the data analysis was arbitrary. The authors chose 2 mG as the dividing line between exposed and non-exposed subjects, and this made the results of the study negative. If 3 mG were chosen, however, the results would be positive.
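The dependence of a result on the choice of dividing line can be illustrated with constructed counts. The figures below are hypothetical and are not taken from the Linet study or any other study; they show only how the same body of data can yield different odds ratios at different cutoffs. The field bands and the function name are assumptions made for this example.

```python
from fractions import Fraction

# Hypothetical subject counts by measured field band in mG.
# These are NOT data from any real study.
bands = [(0.0, 2.0), (2.0, 3.0), (3.0, float("inf"))]
cases = [500, 60, 40]
controls = [500, 75, 25]


def odds_ratio(cutoff):
    """Odds ratio when every band starting at or above `cutoff` counts as exposed."""
    exposed = [low >= cutoff for low, _high in bands]
    a = sum(n for n, e in zip(cases, exposed) if e)          # exposed cases
    b = sum(n for n, e in zip(cases, exposed) if not e)      # unexposed cases
    c = sum(n for n, e in zip(controls, exposed) if e)       # exposed controls
    d = sum(n for n, e in zip(controls, exposed) if not e)   # unexposed controls
    return Fraction(a * d, b * c)


print(float(odds_ratio(2.0)))  # dividing line at 2 mG
print(float(odds_ratio(3.0)))  # dividing line at 3 mG
```

In this constructed data set the odds ratio is exactly 1 with the dividing line at 2 mG and 23/14, or about 1.6, with the dividing line at 3 mG. Without a dividing line fixed in advance by a hypothesis, either figure could be reported as the result.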
Asymmetry in the degree of effort in classifying cases and controls also continues to occur. For example, an association between powerline EMFs and brain tumors in electric utility workers was reported. The cases were identified on the basis of cancer diagnoses reported to the health insurance system, but the controls were matched simply on the basis of year of birth. Thus, the presumption was made that unless a subject was seen by a physician, diagnosed as having cancer, and reported to the health insurance system, then the subject did not have cancer for the purpose of this study. Consequently, every case is certain but every control is problematical.
Some problems regarding inferential limitation of EMF epidemiological studies have actually worsened, occasioned by the development of computers and commercially available statistics software packages. In a study from Greece, for example, 4 unvalidated surrogates for EMF exposure were chosen and arbitrarily divided into 5 levels. The data was adjusted for 18 apparently irrelevant factors using the logistic equation, without explanation. The results of this complex design protocol are uninterpretable with reference to any identifiable standards of judgment.
Epidemiological Criteria for Causal Association
Because the EMF epidemiological studies yielded statistical associations whose implications were problematical and significantly dependent on human judgment, criteria appropriate for use in evaluating the literature to reach an overall judgment must be delineated. These criteria ought to facilitate good or valid or generally acceptable opinions regarding the implications of the EMF epidemiological literature. Unfortunately, the criteria often applied to evaluate the studies do not fulfill the obvious need for objectivity.
The difficulty in assessing the causative role of environmental factors in human disease is an old problem. More than a century ago Robert Koch, a German physician and microbiologist, recognized that a mere statistical association between two factors was insufficient to warrant a conclusion that the factors were causally associated, and he formulated several principles for use in assessing the veracity of apparent relationships in particular cases. His principles were formulated to facilitate evaluation of the role of microbes in diseases, because the environmental factors that were of interest to him were infectious agents.
Koch's general notion was that any claim that a particular microbial agent was responsible for a particular disease required that four criteria be satisfied. First, that the microbe occurs in every case of the disease. Second, that the microbe doesn't occur in other diseases. Third, that the microbe doesn't occur where there is no disease. Fourth, that the microbe can be isolated from a diseased subject, grown in culture, and used to induce the disease in a non-diseased subject.
Koch's criteria have proved durable and useful, but they are applicable only to infectious agents and they are insensitive. If the criteria are satisfied it can confidently be concluded that the microbial agent caused the disease, but the cause of the disease is left unresolved if the criteria are not satisfied.
In 1965, Austin Bradford Hill (1897-1991), an English medical statistician, published a set of criteria (Hill's criteria) that he suggested could serve to help evaluate the causal role of any environmental factor. The criteria first appeared 11 years earlier in a little-known paper whose author listed them in an attempt to explain why he concluded that smoking and cancer were causally related. Essentially the same criteria appeared again in 1964, and for the same reason, in the famous Surgeon General's report linking smoking and cancer. Hill paraphrased those criteria in what the famous epidemiologist Abraham Lilienfeld considered to be more elegant language, and the criteria subsequently became best known as Hill's criteria.
Hill's first criterion involved the magnitude of the statistical association between an environmental factor and a disease, which is typically measured in epidemiological studies by the relative risk or odds ratio. Hill assumed, without any explicit justification, that a higher relative risk would imply more confidence in the causal role of the factor. It is difficult to see why this should be the case because the existence of a cause-effect relationship and the magnitude of the effect are independent concepts. Furthermore, observed statistical associations are affected by both the causal relationship and the presence of non-causal factors that introduce variance into a study. An observed low relative risk would be consistent with a true high relative risk in the context of variance-inducing conditions, and with a true low relative risk in the case in which the variance was low.
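One way an observed relative risk can understate a true one is through error in classifying who was exposed. The sketch below is an illustration under assumptions of my choosing: the group sizes and risks are hypothetical, and a fixed fraction of each group is assumed to be labeled with the wrong exposure status regardless of whether disease occurred. The function name is invented for this example.

```python
from fractions import Fraction


def observed_relative_risk(n_exp, n_unexp, risk_exp, risk_unexp, misclassified):
    """Expected relative risk when a fraction `misclassified` of each true
    exposure group is labeled as belonging to the other group, independent
    of disease status."""
    correct = 1 - misclassified
    # Persons and expected cases among those LABELED exposed.
    labeled_exp_n = correct * n_exp + misclassified * n_unexp
    labeled_exp_cases = correct * n_exp * risk_exp + misclassified * n_unexp * risk_unexp
    # Persons and expected cases among those LABELED unexposed.
    labeled_unexp_n = correct * n_unexp + misclassified * n_exp
    labeled_unexp_cases = correct * n_unexp * risk_unexp + misclassified * n_exp * risk_exp
    return (labeled_exp_cases / labeled_exp_n) / (labeled_unexp_cases / labeled_unexp_n)


# Hypothetical population: true risks of 3% and 1%, so the true relative risk is 3.
true_risk_exposed = Fraction(3, 100)
true_risk_unexposed = Fraction(1, 100)

for fraction_wrong in (Fraction(0), Fraction(3, 10), Fraction(1, 2)):
    rr = observed_relative_risk(
        1000, 1000, true_risk_exposed, true_risk_unexposed, fraction_wrong
    )
    print(float(fraction_wrong), float(rr))
```

With no classification error the relative risk is 3. When 30% of each group is labeled wrongly it falls to 1.5, and when half are labeled wrongly it falls to 1, although the true effect is unchanged throughout.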
Hill was obviously impressed by the high risks found in classic epidemiological cases, including a risk of 200 for scrotal cancer in chimney sweeps, 30 for lung cancer in smokers, and 14 for death in the cholera epidemic of 1854 among customers supplied by the Southwark and Vauxhall Water Companies. Hill confused the concept of public-health significance, which is related to the magnitude of the effect, with the idea of causality, which is not. It is not logical to regard the magnitude of the relative risk in an epidemiological study as probative of the existence of a cause-effect relationship.
Hill's second factor was consistency of association. The idea was that if the same or similar observations were made in studies by different investigators in different places at different times under different circumstances, the inference that the factor and the disease were causally related would be proportionately strengthened. No one can seriously quarrel with this idea in the case where consistency is observed. The real question, however, is what interpretation should be given to apparently inconsistent studies such as the EMF epidemiological studies? The criterion of consistency of association cannot logically be accepted as necessary because it is entirely possible that a sought-after statistical association performed by different persons in different places and times under different circumstances should yield inconsistent results because there could be true causal associations in some of the studies but not in others. The criterion is therefore no help at all in evaluating the EMF epidemiological literature.
Hill's third epidemiologic criterion for causal association was specificity of association, but even Hill recognized that this criterion was insignificant because there are essentially no instances of specific relationships between environmental factors and particular diseases, since diseases may have more than one cause. Hill consequently conceded that specificity of association was only a sufficient, not a necessary, factor in judging the existence of true cause-effect relationships.
Hill's fourth criterion was temporality, by which he meant that a factor cannot properly be regarded as a cause if it comes after the effect. The criterion, however, is trivial because it is part of the definition of effect.
Hill's fifth criterion was an assumption - the now familiar assumption of linearity. He argued that if more of a putative cause produced more of the effect, then one could have greater confidence in the reality of the cause-effect relationship. Again, as with the third criterion, we have a listing of a sufficient but not necessary factor.
Hill's sixth criterion was plausibility, but he never explained what he meant by that term. At least three possible meanings of plausible can be identified on the basis of the way the term is used generally. Plausible can mean that a mechanism can be suggested to account for a particular observation. For example, an observation that addition of a signaling agent to a group of cells causes the cells to make proteins can be viewed as plausible because a putative mechanism, namely interaction of the signaling agent with membrane-bound receptors leading to initiation of a second-messenger system, can be postulated. On the basis of this meaning of plausible, the link between powerline EMFs and cancer is plausible because the occurrence of a stressor reaction mediated by serum corticoids, leading to impaired immunosurveillance and increased risk of cancer, can be postulated.
Plausible can also mean that a mechanism can be suggested and evidence for the mechanism can be provided. This definition would be met if the membrane receptor in the example above was identified and shown to initiate a particular sequence of intracellular changes following interaction with its ligand. The link between powerline EMFs and cancer probably meets this definition of plausible because there exists evidence showing that EMFs can affect serum corticoids, immune parameters, and central nervous system activity.
Plausible can also mean that the mechanism of action linking the cause and effect must be supported by an extensive amount of evidence such that it can be concluded that the mechanism has been proved. Such would be the case in the example above, for example, if all the intermediary steps following the ligand-receptor interaction were specifically identified up to and including the mechanisms that resulted in secretion of the newly synthesized proteins. The link between EMFs and cancer cannot meet this definition of plausible.
Thus plausible can become (and has become in the case of EMF studies) a code word indicating one's general attitude, rather than a concept that is useful in arriving at an attitude. In its general effect, the criterion creates a bias against novel ideas. For example, Semmelweis's exhortation that Viennese medical students should wash their hands after dissecting cadavers prior to examining women on the maternity wards as a means of avoiding childbed fever was implausible, coming as it did prior to the work of Lister and Pasteur. Only after recognition of the germ theory and the development of antisepsis were any of the plausibility criteria satisfied.
Hill invoked a seventh criterion he called coherence, which was actually a degree of his plausibility criterion. Semmelweis's theory, for example, was not plausible but it would have been extremely implausible if Semmelweis's peers had already accepted the view that microbes did not cause disease. A cause-effect relationship is coherent, according to Hill, if it does not contradict established fact. Hill gave no examples of the operation of the coherence criterion, and its value as an independent consideration in evaluating EMF epidemiological studies seems dubious.
Hill's eighth criterion involved experimental manipulation. If a statistical association between an environmental factor and a disease is observed, and, all other things being equal, one repeated the study but removed the environmental factor, would the occurrence of disease be altered? This, of course, is the classic definition of the method of experimental biology and it is the proper one to show the existence of a cause-effect relationship. But such a study is not what is ordinarily meant by an epidemiological study.
Hill's last criterion was analogy. Given that thalidomide causes birth defects, he said that we can accept less evidence that another drug could cause the same outcome. There seems to be no logical basis for this criterion and, insofar as I can tell, it has not been used by others to judge epidemiological data.
Thus, Hill's criteria are no help at all in evaluating the EMF epidemiological literature. They have been employed to describe opinions about the public-health significance of EMF epidemiological studies, but there is no case where Hill's criteria were used to justify or explain an opinion regarding the significance of the EMF epidemiological studies.
The EMF epidemiological studies have the surpassingly great benefit of providing information about the actual object of interest - human beings - rather than laboratory animals. However, epidemiological studies have significant inferential limitations that arise, ultimately, from the way the studies were performed. Epidemiologists can't do randomized, controlled studies to evaluate the impact of powerline EMFs on human health. This fundamental distinction from the way human clinical studies are done and from the way laboratory experiments are conducted, combined with cost factors and with the relaxed standards for experimental design that have been accepted by epidemiological journals, results in uncertainty that requires adoption of decisional rules capable of investing epidemiological data with meaning. Standing alone, the EMF epidemiological data has no meaning.
What is needed is an evaluation of the methods and procedures of EMF epidemiology, irrespective of the results in particular studies, and a determination whether the data from such studies will be deemed acceptable for forming judgments regarding whether powerline EMFs affect human health. Further, if the data is acceptable, the method whereby the inferences will be drawn must be specified. It is possible, for example, that a fair committee of EMF experts might conclude (and justify) that no conceivable results of EMF epidemiological studies are worth considering. Any such conclusion regarding the EMF epidemiological studies would require examples of epidemiological studies that the committee would consider applicable to the problem of evaluating cause-effect relationships involving environmental factors. Then, future studies could be scrutinized to ascertain whether they contained the needed elements that were missing from the earlier studies. The scientific validity of the decision would be guaranteed because of the process by which the committee was chosen and by which it functioned.
As discussed in the previous Section, it seems quite reasonable to expect that scientists will decide scientific questions, and laymen will decide how scientific data is to be used in forming public policy. Conceptually at least, the two decisional levels are discrete. In contrast, with epidemiological studies, there is no such separation. The scientific and public-health considerations are inextricably commingled when epidemiological data is evaluated. For this reason, I think it would be inappropriate to attempt to evaluate the EMF epidemiological data with regard to the issue whether powerline EMFs affect human health via a process that was restricted to scientists only.
Those charged with defining the requisite criteria should approach their task on a limited pragmatic basis, and not attempt to devise criteria for guiding all disputes and inquiries. Koch, for example, in formulating his criteria, dealt with a particular problem, namely infectious agents. Similarly, the experts responsible for the Surgeon General's report formulated criteria aimed at helping to resolve a particular problem, namely the link between smoking and lung cancer. In both instances, the authors explicitly recognized that the proposed solution related to a particular problem, and did not necessarily encapsulate a philosophical approach applicable to all problems in scientific reasoning. It is possible, of course, that reasoning principles elucidated as an explanation and justification for why and how the EMF epidemiological literature should be viewed will be relevant to other potential epidemiological issues, but that possibility remains to be determined, case by case.
Only when decisional criteria are established will it be possible to cut the present Gordian knot of controversy regarding the epidemiological significance of powerline EMF studies. Personally, for two reasons, I am persuaded that the EMF epidemiological studies show that powerline EMFs can affect human health. First, and most importantly, almost every study conducted has yielded a relative risk greater than 1.0, and the existence of a true cause-effect relationship is the only rational explanation for this global pattern that I can see. Second, the result is plausible in both the first and second sense of that term, as defined above.
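The first reason can be expressed as a sign test. The sketch below is an illustration only: the study counts are hypothetical, and it assumes that the studies are independent and that, if there were no effect and no bias shared across the studies, each study would be equally likely to yield a relative risk above or below 1.0. The function name is invented for this example.

```python
from fractions import Fraction
from math import comb


def sign_test_upper_tail(n_studies, n_above_one):
    """Probability that at least `n_above_one` of `n_studies` independent
    studies yield a relative risk above 1.0 when each study is equally
    likely to fall above or below 1.0."""
    tail = sum(comb(n_studies, k) for k in range(n_above_one, n_studies + 1))
    return Fraction(tail, 2 ** n_studies)


# Hypothetical tally: 18 of 20 studies with a relative risk above 1.0.
probability = sign_test_upper_tail(20, 18)
print(probability, float(probability))
```

For this hypothetical tally the probability is 211/1048576, or about 0.0002. The calculation depends entirely on the stated assumptions, in particular on the absence of any bias common to the studies.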