Saturday, February 11, 2012

Too BOLD

This week in seminar we were presented with data from the Beef in an Optimal Lean Diet (BOLD) study, published this month in the American Journal of Clinical Nutrition (2012; 95: 9-16).

The BOLD study was generously brought to us by our friends at the Beef Checkoff Program, whose efforts include "working to continue growth in beef demand", in part by developing "informational and promotional projects ... based on research relating to nutritional value of beef and beef products". I'm confident that this research funding, as well as the travel grants and honoraria that came with it, had no influence on the design, implementation, interpretation and presentation of the study, but let's take a quick look at the methodology anyway, just in case.

The authors claim in their discussion that their "study population was representative of a large portion of the US population ... and thus, the findings have broad applicability". According to the methods, they recruited healthy men and women, 30 to 65 years old, with elevated LDL-cholesterol (bad cholesterol). They had a few other restrictions worth noting:

Inclusion Criteria:
  • Body Mass Index 18.5-37 kg/m²
  • Triglycerides <3.95 mmol/L
  • Blood pressure <140/90 mmHg
  • Nonsmoking

Exclusion Criteria:
  • Cardiovascular, liver, kidney or autoimmune disease, or diabetes
  • Taking cholesterol- or lipid-lowering medications or supplements
  • Pregnant or lactating
  • Recent weight loss (>10% in 6 months)
  • Vegetarian

Of the 968 people who responded to the advertising for the study, 42 met the criteria and were enrolled - that's 4.3% of the subset of the population that would even respond to such a study - aka, a "representative sample". And this overlooks the 17 to 21% non-completion rate in the study.

The BOLD study employed a particularly interesting design known as a crossover. In a crossover study, subjects are exposed to each of the study treatments in turn. This approach has two distinct advantages:
  1. It effectively increases the sample size, because each participant contributes data to every exposure group rather than just one (as in a parallel randomized trial)
  2. It emulates the counterfactual ideal by comparing people with themselves
The counterfactual ideal is a theoretical model of the perfect study design for determining causality. In the counterfactual ideal, the exposure of interest would be compared to non-exposure in the same group of individuals, over the same time period (like a do-over). Because each person would be re-living the same time period over again, all of the factors that might affect the results are controlled for, and any differences in outcome would be solely due to the change in exposure. This is obviously not possible, but the crossover design mimics this scenario by having subjects serve as their own comparison between exposure groups. It is important to recognize that it does so imperfectly - external factors will differ, and the first exposure may affect subsequent exposures in predictable or unforeseen ways. Two commonly known order effects are:
  1. Learning - if the outcome is something that can be learned, the results will likely be better the second time around
  2. Carryover - the effect of the first exposure on the outcome of interest "carries over" to the second exposure, so that the starting point of the subsequent exposures is not the same
It is possible to limit the order effects by equally and randomly allocating subjects to start with different exposures, and by having a grace (washout) period between the exposures to allow the subjects to return to a baseline state. When done well, a crossover study is one of the better designs for assessing causality, particularly when the source population for recruitment is small.
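To make the allocation idea concrete, here is a minimal sketch in Python (my own illustration, not the BOLD authors' actual procedure; the subject labels and random seed are arbitrary). It builds a cyclic Latin square so that each diet occupies each position in the sequence equally often, then randomly hands the sequences out to subjects:

```python
import random

def latin_square_orders(treatments):
    """Cyclic Latin square: across the k sequences, every treatment
    appears exactly once in every position (1st, 2nd, ...)."""
    k = len(treatments)
    return [[treatments[(i + j) % k] for j in range(k)] for i in range(k)]

def allocate_orders(subjects, treatments, seed=1):
    """Assign each subject a treatment order, using every sequence
    equally often before reusing any, in random order."""
    rng = random.Random(seed)
    sequences = latin_square_orders(treatments)
    # Repeat the full set of sequences so each is used equally often,
    # trim to the number of subjects, then shuffle the assignments.
    pool = sequences * (len(subjects) // len(sequences) + 1)
    pool = pool[:len(subjects)]
    rng.shuffle(pool)
    return dict(zip(subjects, pool))

diets = ["HAD", "DASH", "BOLD", "BOLD+"]
subjects = [f"subject_{i:02d}" for i in range(1, 9)]
for subject, order in allocate_orders(subjects, diets).items():
    print(subject, "->", " / ".join(order))
```

A scheme like this guards against order effects on average; it does nothing about carryover within a given subject, which is what the washout period is for.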

The BOLD study examined four different feeding regimens, each 5 weeks in duration, with a 1-week washout period between regimens:
  1. Healthy American Diet (HAD)
  2. Dietary Approaches to Stop Hypertension (DASH)
  3. Beef in an Optimal Lean Diet (BOLD)
  4. Beef in an Optimal Lean Diet Plus Protein (BOLD+)

[Diagram not shown: a subject's path through the study - the four 5-week feeding periods in randomized order, each separated by a 1-week washout.]

Remember, the washout period in a crossover study is used to return subjects to a baseline-like state (to mimic the counterfactual ideal) before beginning the next treatment. This reduces the chance of a carryover effect from one treatment to the next. Given the reductions in LDL-cholesterol from 3.6 to ~3.22-3.44 mmol/L that were observed in this study, I find it highly unlikely that carryover did not occur with a washout period of only one week. It makes intuitive sense that the effect size of a lipid-lowering diet will be proportional to the baseline LDL-cholesterol concentration (it is harder to lower someone's cholesterol if it is already in the normal range). The authors provide no rationale for the brief washout period, and do not comment on, or provide data regarding, the carryover effect. If carryover occurred, and there is no way of knowing if it did, then the order of the dietary treatments would have been very important.
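To see why this matters, here is a toy simulation - entirely my own construction, with made-up parameter values, not the study's data or model. Assume each diet lowers LDL by a fixed fraction of whatever the subject's LDL is at the start of that period, and that only part of the drop rebounds during a short washout. The apparent effect then shrinks with each successive period:

```python
# Toy simulation of carryover in a crossover trial (hypothetical
# parameters; not the BOLD study's data).
BASELINE_LDL = 3.6        # mmol/L, true baseline
EFFECT_FRACTION = 0.08    # each diet lowers LDL by 8% of period-start LDL
# Washout recovery: 1.0 = complete washout, 0.0 = no washout at all.

def run_sequence(n_periods, recovery):
    """Return the apparent LDL reduction observed in each period."""
    ldl = BASELINE_LDL
    reductions = []
    for _ in range(n_periods):
        drop = EFFECT_FRACTION * ldl   # effect scales with period-start LDL
        reductions.append(drop)
        ldl -= drop
        ldl += recovery * drop         # partial rebound during washout
    return reductions

for recovery in (1.0, 0.5):
    drops = run_sequence(4, recovery)
    label = "complete" if recovery == 1.0 else "incomplete"
    print(f"{label} washout:",
          ", ".join(f"period {i+1}: -{d:.3f}" for i, d in enumerate(drops)))
```

Under incomplete washout, whichever diet happens to be tested later starts from a lower baseline and looks weaker - exactly the kind of order dependence the authors never address.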

Thankfully, subjects were randomly assigned the order in which they were placed on the four feeding regimens. Simple randomization ensures that each feeding regimen has an equal opportunity of being 1st, 2nd, 3rd or 4th for each subject. However, it does not guarantee that the groups are the same at the beginning of each feeding regimen. This is where "Table 1" comes in - the first table in almost any published article provides the baseline characteristics of the subjects. In studies where groups are being compared with one another, particularly if there is a possibility of bias, it is convention and good scientific practice to display the baseline characteristics of each group separately for the purpose of comparison. This transparency provides other researchers with peace of mind (not a guarantee) that the groups were similar before treatment, indicating that the simple randomization was effective and that any differences in outcome are due to the exposure of interest.

In the BOLD study, there were 36 subjects and 4 treatment groups. This is a very small sample size considering that the primary outcome of interest, LDL-cholesterol, ranged from 2.46 to 4.84 mmol/L at baseline. Consequently, there is a good chance that the randomization of feeding regimen order resulted in baseline imbalance across one or more groups. However, instead of providing baseline characteristics according to feeding regimen order, the subjects were grouped according to gender, so that we can compare males and females. Apparently the males had a higher BMI at baseline - I fail to see the use of this information in a small crossover study where each participant acts as their own control.

While we are on the topic of randomization and sample size, let's discuss power. Not the ability to influence others, but the likelihood of drawing wrong conclusions. Have you ever wondered how researchers determine how many subjects they should recruit? It is actually one of the more important considerations when planning a study - too many subjects wastes resources and places undue burden on participants, whereas too few subjects increases the probability of interpretation errors. There are two types of interpretation errors:
  1. You determine that the exposure had an effect on the outcome of interest, when it actually doesn't (called an "alpha" error)
  2. You determine that the exposure didn't have an effect on the outcome of interest, when it actually does (called a "beta" error)
Both errors can have serious implications, but we are generally more conservative when it comes to alpha errors - meaning we would rather miss an actual effect than say one exists where it doesn't. There is still chance to take into consideration - by convention, we generally permit a 5% chance of making an alpha error and a 20% chance of making a beta error, and this dictates the sample size needed. It makes sense that as a study grows from 2 subjects per treatment group to 30 subjects per treatment group, the probability of an error occurring by chance alone decreases significantly. When sample sizes aren't large enough to prevent interpretation errors, we say the study is "underpowered", and of the two errors, a beta error is the more likely to occur. In fact, the "power" of a study is defined simply as the complement of the chance of a beta error (i.e., a 0.2 probability of a beta error = a power of 0.8).
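For the curious, the conventional calculation is nearly a one-liner with Python's statsmodels, using the paired-design t-test power class (appropriate for a crossover, where subjects are their own controls). The effect size of 0.5 below is an arbitrary illustration, not the value the BOLD authors derived from the DASH data:

```python
from statsmodels.stats.power import TTestPower

# Conventional thresholds: 5% alpha, 80% power (20% beta).
analysis = TTestPower()
n = analysis.solve_power(
    effect_size=0.5,          # standardized effect size (illustrative value)
    alpha=0.05,               # allowed probability of an alpha error
    power=0.80,               # 1 - beta; beta = probability of a beta error
    alternative="two-sided",
)
print(f"Subjects needed: {n:.1f}")   # ~33 for d = 0.5
```

Note how sensitive the answer is to the assumed effect size: because the required n scales with 1/d², halving the assumed effect roughly quadruples the sample size needed.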

So how did the BOLD study do? Well, they did a power analysis - check!
"Analysis used the following assumptions: power was set at 0.8, alpha was set at 0.05, and 2-tailed tests were used. It was estimated that a sample size of 40 was sufficient to test the primary LDL-cholesterol hypothesis while allowing for a 10% dropout rate"
As per convention, there was a 20% chance of a beta error and a 5% chance of an alpha error with a sample size of 36. However, due to a higher-than-anticipated dropout rate (21%), only 33 people completed all four feeding regimens (I think). Oddly enough, some feeding regimens had reported sample sizes of 34 and 35. Apparently, they weren't going to let a little thing like dropout stop them from including data from people who completed only some of the feeding regimens - so much for the counterfactual ideal.

Before leaving this topic, I would like to take a moment to mention the dataset they used for their power analysis: the original DASH trial. Because the findings of the DASH trial were used to calculate the sample size for the BOLD study, it is essential that the two be comparable. There are several differences between the studies, including the fact that the DASH trial was an 8-week dietary intervention, not 5 weeks. I fully acknowledge that there is rarely good pilot data available for determining sample size, and researchers need to guesstimate. In these circumstances, it is generally preferred to err on the side of too many people.

The authors conclude that:
"The results of the BOLD study provide convincing evidence that lean beef can be included in a heart-healthy diet that meets current dietary recommendations and reduced [cardiovascular disease]"
The statistical tests used in this study are designed to detect significant differences between groups, allowing for a 5% chance of making an alpha error. The ability to claim that two groups were similar, on the other hand, as was done here, is a reflection of power. Even if the BOLD study had had the intended sample size of 36 (it didn't), an adequate washout period (it didn't), and a design similar to the DASH study (it didn't), the probability that their findings occurred by chance would still have been 1 in 5 - apparently that qualifies as "convincing evidence". It makes you wonder whether they welcomed the possibility of beta errors, and the justification they provide to suggest that lean meats are "clinically proven" to be part of a "heart-healthy diet".

These design flaws are somewhat unfortunate because Penn State University has a Metabolic Diet Study Center, allowing them to prepare all of the meals from scratch, which provides excellent control over the dietary exposure of interest. Moreover, subject compliance with the prescribed diets was reported to be 93%, which, although unconfirmed, is quite good.

The feeding regimens themselves left something to be desired. The so-called "Healthy American Diet" (HAD), which served as the control in this study, wasn't so much "healthy" as it was typical - high in saturated fat and cholesterol, and low in fiber - all factors associated with higher LDL-cholesterol. This is a common research approach, used to ensure that the findings will be positive (see my previous post "Supply, Meet Demand - The Future of Science?" for a more detailed description of this concept). The DASH diet feeding regimen in the BOLD study was considerably better than the HAD, but was not consistent with the spirit or the letter of the true DASH diet, as shown below:

Dietary Component          Original DASH Study    BOLD Study
Fruits                     5.2                    4.1
Vegetables                 4.4                    4.3
Grains                     7.5                    4.5
Low-Fat Dairy              2.0                    2.3
Regular-Fat Dairy          0.7                    0.1
Nuts, Seeds and Legumes    0.7                    2.1
Beef, Pork and Ham         0.5                    1.0
Poultry and Fish           1.1                    3.7
Fats and Oils              2.5                    4.0

The table above compares the DASH diet used in the original DASH trial (the same trial used to determine the BOLD study's sample size) with the DASH diet used in the BOLD study. As can be seen, the BOLD study's DASH diet is lower in fruits, grains and regular-fat dairy, and higher in nuts, seeds and legumes, meat, and fat. The DASH diet is meant to be a plant-based diet, yet there is an average daily provision of 4.7 oz of meat. The importance of this issue becomes clearer as we look at a comparison of the DASH and BOLD feeding regimens within the BOLD study:

Dietary Component          DASH Diet    BOLD Diet
Fruits                     4.1          4.5
Vegetables                 4.3          3.9
Grains                     4.5          5.6
Low-Fat Dairy              2.3          1.8
Regular-Fat Dairy          0.1          0.0
Nuts, Seeds and Legumes    2.1          1.3
Beef                       1.0          4.0
Poultry, Pork and Fish     3.7          1.0
Fats and Oils              4.0          4.3


The total daily meat provision was 4.7 oz (DASH) and 5.0 oz (BOLD). So, what this essentially compared was two relatively healthy diets, one with mostly poultry, pork and fish, and one with lean beef. And when I say lean, I'm talking 95% lean ground beef, which is leaner than the extra-lean ground beef that I purchase at my grocery store.

This brings me to my final complaint about this study: its implications - the message that will be sent, and how it will be received by the public. I understand that the beef industry is feeling unfairly persecuted for its excessive use of energy, land and water, its massive production of waste and greenhouse gases, its contribution to antibiotic-resistant bacteria, its unethical treatment of animals, and for potentially causing cardiovascular disease, cancer and death in humans. Even if all of these things are true, in our capitalistic, individualistic society, they have a right, nay a responsibility (to their stakeholders), to do what they can to get people to eat more beef.

It's a relatively straightforward process:
  1. Finance research that can cast doubt on the "eat less beef" messages by creating scapegoats and caveats, so that the responsibility is externalized onto you, the consumer. The BOLD study attributes the heart disease risk from eating beef to saturated fat - check!
  2. Create oversimplified, misleading messaging, and distribute it as widely as you can. Consumption of lean beef is an important part of a heart-healthy diet - check!
This creates a distinction in the minds of consumers between their product, beef, and the cause of heart disease, saturated fat. Now it is up to the consumers to choose the beef cuts that are lower in saturated fat - another form of individual responsibility (or person blaming). And even assuming that this is the case - that it is the saturated fat in beef, not beef itself, that contributes to heart disease - beef is still a major source of saturated fat in our diet. Telling people to consume lean cuts of beef is shortsighted, weak health messaging that serves industry interests rather than the public. It is not as if producers sell only the lean cuts and throw out the rest - all of it makes its way through the food system and into our diets.

There were many design flaws in the BOLD study, and the AJCN should be embarrassed to have done such a poor job editing it (the Table 1 problem, aggregate-only information, uneven numbers in a crossover study, unwarranted conclusions, etc.). More concerning, however, are the messages that are sure to pour out from the beef industry - they definitely got their money's worth.

Wednesday, January 4, 2012

Supply, Meet Demand - The Future of Science?

The scientific method is meant to be a circle - a continuous cycle of observation, hypothesis, experimentation, and analysis, leading back to new observations.
It's an imperfect system, and there are many issues, including:
  • flaws in study design (e.g., small sample sizes)
  • data mining / cleaning / fabrication
  • unwarranted conclusions
  • the failure to publish all details / results
  • the failure to publish negative findings
  • inappropriate citing of previous research
Many of these issues could be addressed by careful editing of article submissions, a topic that I plan to cover in a future post. For now, I want to discuss the influence of the food industry on nutrition research. The inspiration for this post came from a colleague's review of a study by Bassaganya-Riera et al. on punicic acid, the main fatty acid (or fat) found in the arils (or seeds) of pomegranates.

I'll begin with the bad news - nutritionists haven't been entirely honest with you. We often group nutrients together based on physicochemical properties, with full knowledge that these groups include a broad range of compounds that are not therapeutically equivalent. However, consumers cannot be expected to differentiate between beta-glucans and fructo-oligosaccharides (both soluble dietary fibers), or alpha-linolenic acid and docosahexaenoic acid (both omega-3 fatty acids), so we just let them believe they are all the same - rest assured, they aren't.



[Figure not shown: the chemical structure of punicic acid, with its trans double bond circled.]

With this in mind, let's discuss punicic acid. It is 18 carbons long, with 3 sites of unsaturation (double bonds), and it makes up the majority of the fatty acids (~75%) in pomegranate seed oil. In this study, it is being investigated for "anti-inflammatory" properties in the prevention of inflammatory bowel disease (IBD). The circled double bond is in the trans orientation, and is responsible for punicic acid's designation as a trans fat.

Yes, it belongs to the family of dreaded "trans fats" that are being banned in supermarkets and restaurants (and with good reason). But this is a different trans fat - one that is found "naturally" in the food. Both milk and meat contain another group of "natural" trans fats, collectively known as conjugated linoleic acid (CLA), which are coincidentally also believed to have health-enhancing properties. The more commonly known trans fats that are linked with several risk factors for coronary heart disease are actually a subset of trans fatty acids that are man-made, usually through a chemical process called hydrogenation. But, I digress.

The study randomized mice to two separate diets, which differed only in fat composition - fat provided about 15% of total calories, 86% of which came from soybean oil. The remaining 14% was either linoleic acid (control group) or pomegranate oil (treatment group). This is the foundation of an experimental study, the assignment of the exposure of interest while attempting to keep all other variables similar between groups. When this is done right, any differences can be assumed to have been "caused" by the exposure.

According to the authors, "the optimal doses of PUA included in these diets were the result of time course and dose titration studies designed to elucidate the optimal anti-inflammatory efficacy of PUA performed previously (data not shown)". I'm admittedly not loving the lack of transparency in that process - moving on.

Interesting that the "optimal dose" was 14% of total fat intake - let's translate.
If I consume 2,000 calories per day and 30% of those calories come from fat (a low-fat diet by Canadian standards), then I am consuming 600 calories as fat. At 9 calories per gram of fat, that is ~67 grams of fat. If 14% of this fat were pomegranate oil, I would need to consume roughly 9 grams of pomegranate oil every day. This amount could quite conceivably be consumed in supplement form, although at 1-1.5 grams per capsule (about the largest a person can comfortably swallow), you would likely need to take 6-9 of them.
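Since this is just arithmetic, here is the same calculation as a small Python function, which makes it easy to rerun under different assumptions (the inputs below are the ones from my example, not parameters taken from the study):

```python
CALORIES_PER_GRAM_FAT = 9

def daily_oil_grams(total_kcal, fat_fraction, oil_fraction_of_fat):
    """Grams of a specific oil needed per day, given total energy intake,
    the share of energy from fat, and the oil's share of total fat."""
    fat_kcal = total_kcal * fat_fraction
    fat_grams = fat_kcal / CALORIES_PER_GRAM_FAT
    return fat_grams * oil_fraction_of_fat

# 2,000 kcal/day, 30% of energy from fat, 14% of fat as pomegranate oil
grams = daily_oil_grams(2000, 0.30, 0.14)
print(f"{grams:.1f} g of pomegranate oil per day")        # ~9.3 g

# At 1-1.5 g per capsule, that works out to roughly 6-9 capsules daily.
largest, smallest = grams / 1.5, grams / 1.0
print(f"{largest:.0f}-{smallest:.0f} capsules per day")
```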

Equally important in this process is the choice of reference group for comparison. There are many options available, and rarely is there a clear best choice when it comes to nutrition. In the case of inflammation, certain fats are better than others. Arguably the most well-studied fat is the pro-inflammatory omega-6 fatty acid, linoleic acid, which is also the major fatty acid in soybean oil. Anyone who studies nutrition knows this, which raises the question: why would you design a study that uses soybean oil as your background diet and linoleic acid as your comparison group? The answer is simple - effect size.

Statistical tests that look for differences between groups take three things into consideration: a) variance, b) sample size, and c) effect size. Effect size is the magnitude of the difference between the two groups. So, if I really wanted to demonstrate that my treatment was effective, I would compare it to the worst possible substance that I could reasonably get away with, thus maximizing my effect size. Then I could advertise my product as being "clinically proven".
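Here is the point in miniature, with entirely made-up numbers: Cohen's d, the standardized effect size, scales directly with how bad the comparator is, and a bigger d means "significance" at a smaller sample size.

```python
def cohens_d(mean_control, mean_treatment, pooled_sd):
    """Standardized effect size: improvement over control, in SD units
    (lower scores = less inflammation in this toy example)."""
    return (mean_control - mean_treatment) / pooled_sd

# Hypothetical inflammation scores, pooled SD = 2.0
treatment = 5.0
neutral_comparator = 6.0           # an inert control
pro_inflammatory_comparator = 8.0  # a control known to worsen the outcome

print("vs neutral comparator:    d =",
      cohens_d(neutral_comparator, treatment, 2.0))        # 0.5
print("vs worst-case comparator: d =",
      cohens_d(pro_inflammatory_comparator, treatment, 2.0))  # 1.5
# Same treatment, three times the apparent effect:
# choose your control, choose your effect size.
```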

The authors of this study concluded that:
"these data indicate that PUA ameliorates experimental IBD by down-modulating inflammation in mucosal immune and epithelial cells"
In reality, all that this study demonstrates is that, in this mouse model fed this background diet, providing 14% of the fat as punicic acid reduces certain inflammatory markers and outcomes compared to providing 14% of the fat as linoleic acid. Unfortunately, nowhere in the results or discussion do the authors acknowledge the pro-inflammatory effects of linoleic acid - perhaps they assume that readers are aware of this fact.

This study was funded by Lipid Nutrition, a company that produces and sells fatty acid supplements. Moreover, the primary author has filed a patent related to punicic acid. Although this information was disclosed (in small print at the end of the paper), given the obvious conflicts of interest, I doubt that objectivity could be maintained here - which would go some way toward explaining the limitations discussed above. Negative results would have been very hard to publish.

Why was this experiment conducted in the first place? It is possible that punicic acid is just a really promising fat, and that the good people at Lipid Nutrition believe that it may be useful for disease treatment and/or prevention. However, at the risk of sounding cynical, I think that the real reason is far less scientific and selfless. Pomegranates have recently become recognized as a "super food" thanks to their high anti-oxidant levels. I feel that it has gotten to a point where I could add a few drops of pomegranate juice to iced tea, and sell it for a dollar more as a health drink.

While we are on the topic, there are no "super foods" - it's a genuinely stupid idea that people really need to get out of their minds. All real foods are super, and with adequate funding and time, I am confident that researchers could extract hundreds of different nutrients from any real food, and design a study that would demonstrate some health-enhancing property. I use the term real food intentionally to differentiate it from the other edible foodstuff that can be purchased at the supermarket.

Back to the matter at hand. The pomegranate juice and the seeds are kind of a package deal, so as the demand for juice increases, there is going to be a surplus of cheap seeds. This creates an opportunity for companies that are able to find a use for them. Turning them into expensive fatty acid supplements is almost poetic, and I would commend their efforts if they weren't so unnecessary and underhanded. The consumers that buy pomegranate juice for its anti-oxidants are likely the same ones that would purchase punicic acid supplements for its anti-inflammatory properties. If only there were a single product that had it all for less.

It is not surprising that turning fruit into juice results in the loss of important nutrients, which lends further support to the notion that you should not drink your fruits and vegetables. Although I must admit that it is far more profitable for food manufacturers to sell plants as their individual bioactive components, I'm not convinced that this is in any way better for the otherwise healthy consumer. As for nutrition as a science, I imagine that it will continue to go where the money is, providing the evidence required to create the demand needed by the food industry.

Bassaganya-Riera J, DiGuardo M, Climent M, et al. Activation of PPAR-γ and -δ by dietary punicic acid ameliorates intestinal inflammation in mice. Br J Nutr 2011; 106: 878-86.