Draft conclusions and recommendations
1. Scientific value of rat feeding studies with whole food/feed for GMO risk assessment
1.1. The G-TwYST project provided a broad set of data indicating that performing rat feeding trials with whole food/feed for the risk assessment of a GM plant would not yield additional information pointing to possible health risks of the GM maize NK603 when compared to the earlier risk assessment published by EFSA (EFSA 2003). The approach used in that earlier assessment comprises molecular characterisation of the genetic modification, phenotypic, agronomic and compositional analysis of the GM line in relation to its conventional counterpart and other non-GM lines (reference lines), as well as the evaluation of all identified intended and potential unintended differences with regard to possible adverse effects on human and animal health and the environment.
1.2. No potential risk for humans and animals was identified in the original assessment of the GM maize NK603. In particular, no triggers for animal feeding studies were identified from the molecular characterisation and the compositional, phenotypic and/or agronomic analyses of NK603. Therefore, the G-TwYST 90-day and long-term animal studies were conducted in the absence of a specific hypothesis. The G-TwYST data from 90-day and long-term rodent feeding studies likewise did not identify potential risks, and therefore support the results from the initial analyses.
1.3. No potential risk has been identified in the course of the 90-day rat feeding study with NK603.
The G-TwYST data from the long-term rat feeding study with NK603 likewise did not identify potential risks, and therefore support the results from the initial analyses and the 90-day rat feeding study.
1.4. Three immune function assays (proliferative activity of lymphocytes upon mitogen and protein stimulation, production of cytokines, and phagocytic activity and respiratory burst of leukocytes), which are not included in the OECD Test Guideline 408 for single substances, were performed in the course of the two 90-day rat feeding trials with the GM maize NK603. The GM maize NK603 did not affect the immune functions tested in either study. Many of the measured variables had low precision, so only very large differences could have been detected with 80% power. Taken together, the above-mentioned analyses did not increase the scientific value of the 90-day rat feeding trials.
1.5. The necessity to perform a feeding trial with whole food/feed should be carefully evaluated given the high number of animals needed (in the case of the G-TwYST project: 720, 172 and 268 in the combined chronic toxicity/carcinogenicity study, the 90-day feeding trial with the 11% and 33% inclusion rates of the GM maize NK603 and the 90-day feeding trial with the 50% inclusion rate of the GM maize NK603, respectively).
2. Design, conduct, analysis and interpretation of rat feeding studies with whole food/feed for GMO risk assessment
2.1. General issues
2.1.1. The protocols outlined in the OECD Test Guidelines 408 for subchronic toxicity testing and 453 for combined chronic toxicity/carcinogenicity testing in rodents (OECD 1998 and 2009) have been designed for the testing of chemicals and provide standard procedures to identify health hazards resulting from the repeated exposure to chemicals for 90 days, 1 year and 2 years. The OECD Test Guidelines recommend the use of at least 10 animals/sex and group for subchronic toxicity testing, 20 animals/sex and group for chronic toxicity testing and at least 50 animals/sex and group for carcinogenicity testing.
Chemicals can be administered individually to rodents at doses several multiples higher than the amount of the chemicals to which humans are exposed in order to test whether they may lead to toxicity. Whole food/feed contains a mixture of constituents and can only be administered to rodents at rather limited levels in order to avoid a nutritional imbalance. Therefore, it is unlikely that substances present in small amounts and with a low toxic potential in whole food/feed will cause any observable effects in animal feeding trials (EFSA GMO Panel Working Group on Animal Feeding Trials 2008).
2.1.2. G-TwYST has shown that reference groups of animals from previous similar trials in the same experimental facility, fed with non-GM plant material considered to be safe, may be used to define a regular bandwidth for each endpoint. Traditional statistical tests applied in toxicological studies to find differences between test and control groups cannot distinguish between biologically relevant and non-relevant effect sizes, and EFSA has recommended placing more emphasis on the use of confidence intervals (EFSA 2011b). Based on a confidence interval approach, equivalence tests can be used to show that a test group is within the bandwidth defined from historical data (‘proof of safety’).
An alternative approach is to define bandwidths using toxicologically defined quantifications of adverse effects. However, these are often not available for the measured endpoints in animal studies. The advantage of using historical data to derive equivalence limits is that no toxicologically defined quantifications of adverse effects are needed, as the latter can be assumed to lie outside the equivalence region.
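The confidence-interval logic behind such an equivalence test can be illustrated with a minimal sketch. It is not the G-TwYST method, which also accounts for the uncertainty in the historical reference data. The sketch assumes fixed, hypothetical equivalence limits (a ratio of 0.80 to 1.25), hypothetical measurements, log-normally distributed data and a normal approximation instead of a t-distribution; the function name `equivalence_by_ci` is made up for this illustration.

```python
from math import exp, log, sqrt
from statistics import NormalDist, mean, stdev

def equivalence_by_ci(test, control, lower_ratio, upper_ratio, alpha=0.05):
    """Return (ci_low, ci_high, equivalent) for the test/control ratio of geometric means.

    Equivalence is claimed only if the whole (1 - 2*alpha) confidence interval
    lies inside the equivalence limits (two one-sided tests at level alpha).
    Normal approximation: an assumption that is rough for small groups.
    """
    log_test = [log(x) for x in test]
    log_ctrl = [log(x) for x in control]
    diff = mean(log_test) - mean(log_ctrl)
    se = sqrt(stdev(log_test) ** 2 / len(log_test) + stdev(log_ctrl) ** 2 / len(log_ctrl))
    z = NormalDist().inv_cdf(1 - alpha)
    ci_low, ci_high = exp(diff - z * se), exp(diff + z * se)
    return ci_low, ci_high, (lower_ratio < ci_low and ci_high < upper_ratio)

# Hypothetical endpoint values for 8 experimental units per group (not project data).
control = [10.0, 10.2, 9.9, 10.1, 9.8, 10.3, 10.0, 9.9]
test_group = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 10.1, 9.7]

ci_low, ci_high, equivalent = equivalence_by_ci(test_group, control, 0.80, 1.25)
print(f"90% CI for ratio: {ci_low:.3f} to {ci_high:.3f}; equivalent: {equivalent}")
```

In this framing the burden of proof is reversed compared with a difference test: a noisy or small study produces a wide interval and therefore fails to demonstrate equivalence, rather than passing by default.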
2.1.3. In the above-mentioned context, the non-GM data obtained in the course of the 90-day and 1-year rat feeding trials performed in the preceding GRACE project were used as historical control data for the equivalence testing in the G-TwYST project. G-TwYST proposed and applied a statistical method for equivalence testing that accounts for statistical uncertainties in both the current 90-day feeding trials and the chronic toxicity testing phase of the 2-year feeding trial as well as in the historical reference data (van der Voet et al. 2017, Goedhart and van der Voet 2017, 2018a, 2018b).
2.1.4. The analysis of the data obtained from the rat feeding trials performed in the course of the G-TwYST project showed that non-significant equivalences and significant differences occurred in no more than about 5% of cases across 1,424 equivalence tests and 3,472 difference tests, which is the expected percentage for statistical tests using a 5% error level.
2.1.5. Criteria to evaluate the scientific quality of 90-day and extended feeding trials on whole food/feed derived from genetically modified plants have been proposed (Schmidt et al. 2016). These should be taken into account when evaluating a rodent feeding trial in the course of a pre-market approval procedure. Including an animal feeding study that does not (fully) comply with the proposed quality criteria in a risk assessment should be decided on a case-by-case basis.
2.1.6. If 90-day, 1-year and/or 2-year feeding studies on whole food/feed derived from GM plants are planned to be performed in the course of research projects not related to a pre-market approval procedure, these should be based on the corresponding OECD Guidelines for the testing of single chemicals and take into account EFSA recommendations as well as the quality criteria proposed by G-TwYST (Schmidt et al. 2016).
2.2. 90-day feeding trials
2.2.1. Two 90-day feeding trials with the GM maize NK603 at inclusion rates of 33% and 50%, respectively, in the diet were performed. Their study design was based on the OECD Test Guideline 408 for testing single chemicals (OECD 1998) and took into account the EFSA Guidance Document (EFSA 2011a) as well as the EFSA Explanatory Statement on 90-day feeding trials with whole food/feed (EFSA 2014).
2.2.2. All diets used were isocaloric and isoproteic. Increasing the inclusion rate from 33% to 50%
maize, as was earlier suggested by EFSA (2014), while rebalancing the diet composition did not
show any indication of nutritional imbalance in the 90-day feeding trial.
2.2.3. G-TwYST evaluated all significant differences identified between the test groups and the
control group, considering equivalences, whether the effects were dose-related and/or
accompanied by changes in related parameters. It was concluded that there were no adverse
effects related to the administration of the GM maize NK603 cultivated with or without
2.2.4. G-TwYST performed a power analysis for quantitative findings in 90-day studies, estimating
effect sizes that could be detected with 80% power (Goedhart and van der Voet 2018c). For
variables in the list of determinations as advocated by the OECD Test Guideline 408 for testing
single chemicals (1998) and the EFSA Guidance Document (2011a), such as body/organ weights,
haematology and clinical chemistry variables, a design with 8 cages per group in a 90-day study
is appropriate for more than 80% of the quantitative variables if deviations of 1.3-fold
(+30% or -23%) or more have to be detected with at least 80% power.
However, the number of quantitative variables as required to be measured by OECD (1998) is
40 to 50, and the within-group variability in these variables was found in the G-TwYST studies
to vary considerably, between 1% and 44% expressed as a coefficient of variation. The 10-20%
of the variables with a relatively high variability would require larger sample sizes to attain the
same power as for the other variables.
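How strongly the required sample size depends on the coefficient of variation can be illustrated with a textbook approximation. The sketch below is not the power analysis of Goedhart and van der Voet (2018c). It assumes log-normally distributed data, independent experimental units, a two-sided test at the 5% level and a normal approximation, which understates the numbers needed for small groups. The function name `units_per_group` is made up for this illustration. A 1.3-fold change is symmetric on the log scale, which is why it corresponds to +30% upwards and 1/1.3, about -23%, downwards.

```python
from math import ceil, log
from statistics import NormalDist

def units_per_group(cv, fold=1.3, alpha=0.05, power=0.80):
    """Approximate number of experimental units per group needed to detect
    a given fold change between two groups (normal approximation, log scale)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided significance level
    z_beta = z.inv_cdf(power)
    var_log = log(1 + cv ** 2)           # log-scale variance implied by the CV
    delta = log(fold)                    # effect size on the log scale
    return ceil(2 * (z_alpha + z_beta) ** 2 * var_log / delta ** 2)

# Coefficients of variation spanning the range reported above (1% to 44%).
for cv in (0.05, 0.15, 0.30, 0.44):
    print(f"CV {cv:.0%}: about {units_per_group(cv)} units per group")
```

Under these assumptions, variables with a moderate coefficient of variation fit within 8 units per group, whereas the most variable ones would need several times as many, which is the reason for selecting the variables of interest before fixing the sample size.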
It is concluded that, apart from setting effect sizes of interest, a prior selection should be made
of those variables for which a sufficiently high power is needed, before a power analysis can
be helpful to set the sample sizes for a 90-day animal study in line with the OECD Test Guideline
408 (1998) and EFSA documents (EFSA 2011a, 2014). These considerations underline the
necessity to perform whole food/feed studies with clearly targeted hypotheses.
2.3. Combined chronic toxicity/carcinogenicity feeding trial
2.3.1. A combined chronic toxicity and carcinogenicity study with the GM maize NK603 at an inclusion rate of 33% in the diet was performed. Its study design was based on the OECD Test Guideline 453 for testing single chemicals (OECD 2009b) and took into account EFSA's Considerations on the applicability of the above-mentioned OECD Test Guideline to whole food/feed testing (EFSA 2013). No toxicologically relevant effects related to the GM maize NK603 or the GM maize NK603 treated with Roundup were observed.
2.3.2. G-TwYST performed a whole food/feed combined chronic toxicity/carcinogenicity study with
50 animals/sex and group for the carcinogenicity phase, which is the minimal number
requested by the OECD Test Guidelines 451 (2009a) and 453 (2009b). A power analysis for
qualitative (yes/no) findings, such as deaths and histopathological results, shows that with
such a generally accepted design only large differences in single incidences can be detected
with at least 80% power (Goedhart and van der Voet 2018c).
Increasing the number of animals is of limited help, and therefore not a suggested direction
for future animal studies. For example, when the control incidence is 1%, a 9-fold rather than
a 16-fold increase can be detected with 100 instead of 50 animals/sex and group, and, when
the control incidence is 10%, a 2.4-fold rather than a 3.2-fold increase. These levels of changes
are often larger than what toxicologists would judge as minimal relevant values. The low
sensitivity is inherent to the qualitative character of the findings.
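The limited power for incidence data can be explored with an exact calculation. The sketch below is not the analysis of Goedhart and van der Voet (2018c), whose test procedure is not described here. It assumes a one-sided Fisher exact test at the 5% level, equal group sizes and binomially distributed incidences; the function names are made up for this illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of k affected animals out of n when the true incidence is p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def fisher_one_sided_p(x_ctrl, x_test, n):
    """One-sided Fisher exact p-value: probability of x_test or more affected
    animals in the test group, given the total number of affected animals."""
    total = x_ctrl + x_test
    tail = sum(comb(n, j) * comb(n, total - j) for j in range(x_test, min(n, total) + 1))
    return tail / comb(2 * n, total)

def power(n, p_ctrl, p_test, alpha=0.05):
    """Exact power to detect an increased incidence with n animals per group."""
    result = 0.0
    for x_ctrl in range(n + 1):
        prob_ctrl = binom_pmf(x_ctrl, n, p_ctrl)
        if prob_ctrl < 1e-12:
            continue  # skip control outcomes with negligible probability
        for x_test in range(n + 1):
            if fisher_one_sided_p(x_ctrl, x_test, n) <= alpha:
                result += prob_ctrl * binom_pmf(x_test, n, p_test)
    return result

# Scenarios from the text: control incidence 1% or 10%, 50 or 100 animals/sex and group.
for n, p_ctrl, fold in ((50, 0.01, 16), (100, 0.01, 9), (50, 0.10, 3.2), (100, 0.10, 2.4)):
    print(f"n={n}, control {p_ctrl:.0%}, {fold}-fold: power {power(n, p_ctrl, p_ctrl * fold):.2f}")
```

Because the test statistic is a count of animals, its distribution is coarse at low incidences, so doubling the group size shifts the detectable fold change only moderately.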
A more appropriate conclusion is therefore that the potential usefulness of the results of such
studies lies in the possibility to interpret patterns of findings rather than in an analysis of single
2.3.3. G-TwYST has confirmed that the variability of quantitative measurements in a combined
chronic toxicity/carcinogenicity study is in general higher for the measurements after 12 and
24 months as compared to 3 or 6 months (Goedhart and van der Voet 2018a, 2018d).
Therefore, the statistical power to detect a certain difference will be lower after 12 or 24
months compared to 3 or 6 months, and larger samples may be needed for GM plant risk
assessment based on 12 or 24 months data. In terms of clinical chemistry and haematology
measures, carried out at the two-year termination period, scientific opinion is split as to their
actual value since the concurrent appearance of age-related diseases, such as end stage kidney
and hormonal senescence, tends to confound the interpretation by increasing the variability
in serum parameters associated with such effects (Young et al. 2011). For this reason, OECD TG
453 leaves the option of whether or not to carry out such measures to the discretion of the study director (OECD 2009b). Similar endpoints show less variability at the 12-month termination point, but nevertheless show greater variability than those taken at the 90-day point, owing to the early appearance of age-related disease in some animals even within this relatively short period.
3. Compositional analysis of plant and diets in the course of a safety assessment
3.1. In the course of G-TwYST, standard and/or certified chemical analysis methods have been employed on a limited number of quality control samples to assess and monitor the quality of the maize kernels and diets with regard to nutrients, anti-nutrients and hazardous contaminants, including mycotoxins and pesticides. Such chemical and compositional analyses allow:
- to assess potential adverse effects in humans following the consumption of the plant material, and
- to control feed quality in the course of a feeding study.
4. Stakeholder engagement, transparency, and data accessibility
4.1. Substantial efforts were made to ensure stakeholder engagement, transparency and data accessibility. These included:
- stakeholder engagement in both project plans and results;
- making available draft research plans and preliminary research results (all data produced) for stakeholder scrutiny;
- a procedure for discussing, systematically considering and responding to all stakeholder comments as well as tracking how the comments were considered in the project;
- detailed documentation and transparency of all steps;
- open access publications, and an open access repository for raw data to be available following academic publication of the results.
4.2. The approach received high praise from the majority of stakeholder participants. Project
team members experienced stakeholder contributions as helpful in improving the project
in terms of the type of studies conducted, the interpretation of results, and the clarity of plans and results.
Challenges experienced include the limited flexibility in EU-funded projects to accommodate
stakeholder suggestions resulting in modification of work plans and not enough time for in-depth
4.3. These challenges, and the considerable resources and effort needed, suggest that this
approach is not suited to routine use. Yet, in case of highly contested scientific-technical
issues and polarised views, the approach remains an interesting option to improve the quality
and social robustness of research - in particular in case of projects with less rigid working plans,
time schedules, and budget allocations.