Justifying The Number Of Animals For Each Experiment
William C. Eckelman (a), Michael R. Kilbourn (b), John Joyal (c), Renée Labiris (d), John F. Valliant (e)
(a) Molecular Tracer LLC, Bethesda MD 20814
(b) Department of Radiology, University of Michigan, Ann Arbor, MI 48109
(c) Biology Division, Molecular Insight Pharmaceuticals, Cambridge, MA, 02142
(d) Department of Medicine, McMaster University, Hamilton, ON, Canada LBM 3Z5
(e) Chemistry and Medical Physics, McMaster University, Hamilton, ON, Canada L8S 4M1
Quantitation of the amount of radioactivity in tissues, fluids and organs following the
administration of a radiolabeled compound to an animal is a standard technique for
studying biochemical pathways and physical processes and for determining gross
distribution and pharmacokinetics of drug candidates. The use of radiolabeled compounds
in preclinical studies is expanding thanks to the increased availability of phosphorimagers,
small animal PET and SPECT cameras, and the unparalleled sensitivity offered by nuclear-based
techniques. It is important to note, however, that producing valuable data from any
biodistribution study, be it ex vivo tissue counting or non-invasive imaging, requires careful
attention to experimental design, particularly with respect to the number of data points (i.e.,
number of animals per time point). Simply choosing n = 3, as is commonly done in the
literature, is not appropriate and can result in misleading conclusions. This issue and other
experimental considerations are discussed further here with the aim of helping researchers
design preclinical biodistribution studies that will produce statistically significant data.
The ability of an in vivo study to detect statistically significant differences between control
and either diseased or treatment groups is a function of the variability in the data and the
magnitude of the difference to be expected. In many publications only three animals per
time point are used, but statistical considerations dictate that in most situations a minimum
of five animals per time point is required, whether the study uses tissue biodistribution or
autoradiography. The rational approach to choosing a sample size is to weigh the benefits
one can gain in information against the cost of increasing the sample size. The benefit of
additional information is not easy to estimate even in applied research, and it is
extraordinarily difficult to estimate in basic research, such as the development of new
targeted radiopharmaceuticals. Therefore, it has been the practice of researchers to
establish target goals for the degree of statistical certainty (P<0.05 in most cases; see
footnote 1) or to consider the degree of statistical certainty that might be achieved with
various sample sizes, and then to balance this with the cost of achieving that certainty.
Footnote 1: The P value is a probability, with a value ranging from zero to one. If the P value is small (P<0.05), a difference as large as the one observed would be unlikely to arise from random sampling alone if the populations had the same mean, so it is concluded that the populations have different means.
To put the evolution of the targeted radiotracer in perspective, the first step of validation is proving that the radiotracer binds to the protein target of interest. In the past, this was accomplished by competitive binding studies with nonradioactive compounds performed either in vitro or in vivo. There should be a correlation between the binding of the radiotracer and the concentration of nonradioactive compound if a specific saturable site is present. The difficulty with these studies is that the substantial amounts of nonradioactive compound necessary to compete in vivo may cause physiologic effects by either changing blood flow or altering biochemistry [1]. For example, a decrease in blood flow caused by the nonradioactive compound may decrease the amount extracted and be misinterpreted as receptor competition. In a similar vein, the nonradioactive compound can be toxic at blocking concentrations and therefore alter the biodistribution of the radioactive compound. In these situations the maximal amount of nonradioactive compound that can be used for blockade might be too small to produce a significant difference between the non-treated group and the treated group.
In addition to studies based on pharmacological competition, genetically manipulated mice have been used to validate targeting of radiotracers. Gene-manipulated mice, predominantly knockout mice, have been especially useful in accelerating the target validation process so that imaging can play a vital role in the development of new chemical entities [2]. The use of knockout mice for validating radiotracers has been facilitated by the efforts of pharmaceutical companies in producing and characterizing these mouse models [3,4]. In heterozygous and homozygous knockout mice, the tissue concentration of the high specific activity radioligand is decreased to ~50% of control and to the level of non-specific binding, respectively. This does not yield the sensitivity [5] or identifiability [6] of the radioligand to free receptor density, but does give changes that are statistically robust. Recently, investigators have developed small interfering RNA (siRNA), which can be administered after the animal has developed normally and therefore eliminates any compensatory changes that may occur with genetic manipulation [7].
The goal is to use the minimal number of animals, based on ethical considerations and the cost of animals and housing (especially for genetically manipulated rodents). To choose that number, software such as GraphPad InStat [8] can be used to simulate the significance of the results. The following assumptions in the analysis of the data are reasonable based on experience in radiopharmaceutical research.
1. Unpaired data analysis if different sets of animals are used for each group. For animals
used as their own controls, e.g., unilateral lesions, or control and target-expressing tumors in
the same animal, paired statistics are appropriate. Likewise, for repeat studies in the same
animal, paired statistics can be used. This is discussed for animal imaging below.
2. Gaussian distribution.
3. Equal standard deviations in the control and treatment/disease groups.
4. A two-tailed P value, which is the most conservative choice, although the direction of the
change is often predictable.
Based on these assumptions, the goal is to determine the number of animals needed to achieve a significance of P<0.05 for various differences between the control group and the treatment/disease group, with either a 15% or a 20% coefficient of variation (CV) for the group statistic, as shown below.
Percent difference     Percent CV due     Number of    Significance
between control and    to biological      animals      at P<0.05
treatment group        variability
averages
20                     20                 2-7          Not significant
20                     20                 8            Significant
20                     15                 5            Significant
25                     20                 5            Barely significant
30                     20                 5            Significant
25                     15                 5            Significant
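The kind of calculation summarized in the table can be approximated with a short script. The Python sketch below is not the InStat procedure; it is a minimal illustration under assumptions that are not stated in the text: the number of animals is taken per group with equal group sizes, the control mean is normalized to 100, the treatment mean is lower by the stated percent difference, the sample SD of each group equals the CV times that group's own mean, and the test is an unpaired pooled-variance t test with a two-tailed P value. The function names (`unpaired_p`, `min_animals`) are illustrative only.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_p(t, df, steps=2000):
    """Two-tailed P value via Simpson's rule on the density over [0, |t|]."""
    upper = abs(t)
    if upper == 0.0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2.0 * total * h / 3.0  # P(-|t| < T < |t|)
    return max(0.0, 1.0 - central_area)


def unpaired_p(percent_diff, percent_cv, n, control_mean=100.0):
    """P value for an unpaired t test from summary statistics.

    Assumption (not from the text): each group's SD is CV x its own mean,
    and the treatment mean is lower than the control mean.
    """
    treatment_mean = control_mean * (1.0 - percent_diff / 100.0)
    sd_control = control_mean * percent_cv / 100.0
    sd_treatment = treatment_mean * percent_cv / 100.0
    pooled_var = (sd_control ** 2 + sd_treatment ** 2) / 2.0  # equal n
    std_err = math.sqrt(pooled_var * 2.0 / n)
    t = (control_mean - treatment_mean) / std_err
    return two_tailed_p(t, df=2 * n - 2)


def min_animals(percent_diff, percent_cv, alpha=0.05, max_n=50):
    """Smallest n per group giving P < alpha, or None if not reached."""
    for n in range(2, max_n + 1):
        if unpaired_p(percent_diff, percent_cv, n) < alpha:
            return n
    return None


if __name__ == "__main__":
    for diff, cv in [(20, 20), (20, 15)]:
        print(diff, cv, min_animals(diff, cv))
```

Under these assumptions, `min_animals(20, 20)` returns 8 and `min_animals(20, 15)` returns 5, in line with the table, while a 25% difference with a 20% CV and 5 animals falls very close to the 0.05 threshold. Other conventions for the SD (for example, the same absolute SD in both groups) shift the results slightly, which is why the assumptions should be stated whenever such a calculation is reported.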
This topic has been discussed previously in the context of a more comprehensive power
analysis by Hussey et al. [8]. Given this summary and the goal of measuring small variations
as a function of disease or treatment, it is important to reduce the standard deviation (SD)
within a group to a minimum. Several factors can decrease the SD of the control group and
the treatment group. They are outlined below as a checklist for dissection studies:
1. The radiochemical purity should be >95%; the specific activity and the specific
concentration should be recorded at a given time such as end of synthesis (EOS) or time of
injection.
2. The animals in the control group should be the same age, strain, weight, and sex, and the
treatment group should be matched to the same criteria at the time of the study. This is
especially important for gene-manipulated mice. For genetically manipulated animals, i.e.,
transgenic or knock-out mice, the background strain should be used as the control.
3. The sacrifice order should alternate between control and treatment/disease groups
and be carried out on one day. Experiments spread over two days should include both
control and treatment/disease animals on each day and be performed at the same time of day.
4. Diet/feeding schedule of the animals should also be controlled.
5. The counting equipment linearity and volume dependence should be validated.
6. The anesthesia should not interfere with the biochemistry being studied. Animals
awake during the radioligand distribution period and animals under anesthesia during the
radioligand distribution period should be compared at a crucial time point to ascertain the
effect of anesthesia [9]. The type of anesthesia can also be changed after the biodistribution
reaches steady state to ascertain the effect of anesthesia.
7. The volume injected is calibrated by either measuring the weight of the syringe or
counting the radioactivity before and after the injection.
8. The injection site is assayed for retained radioactivity and this is subtracted from the
total counts. If the injection is made in a vein in the tail, the tail should be cut into sections
and analyzed as part of the organs. The data on the residual activity in the tail should be
reported.
9. The sample size is such that the weight of the tissue can be measured with a SD of
<5%.
10. The total injected counts should be diluted appropriately so that the radioligand
remains soluble and the counts per minute are in the linear range of the counting equipment.
This is important for H-3 labeled compounds counted by liquid scintillation, but also for
compounds originally formulated in organic/water mixtures.
11. Preset counts are preferable to preset time so that the count statistics are the same
for all samples.
12. Counting dead time should not exceed 20%.
13. All samples should be decay corrected to a common time, preferably the time
chosen for the specific activity and the specific concentration.
14. Samples containing the diluted total counts should be placed at the beginning,
middle, and end of the tissue samples, and empty tubes should be placed between each
sample to monitor for cross-talk between samples.
15. Data should be collected and analyzed using a statistical program.
16. If not using the entire organ to determine counts, use a standard sample size from a
specific portion of the organ for all animals.
17. If animals are under anesthesia for a prolonged time period, body temperature
should be monitored and maintained.
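Several checklist items reduce to simple arithmetic on the counting data: subtracting the activity retained at the injection site (item 8), choosing preset counts so that counting statistics are uniform (item 11), and decay correcting all samples to a common time (item 13). The Python sketch below illustrates these three steps. It is a minimal example only; the function names are hypothetical, the half-life is supplied by the user for the radionuclide in question, and the counting error assumes simple Poisson statistics without background or dead-time corrections.

```python
import math


def decay_correct(counts, elapsed_min, half_life_min):
    """Correct measured counts back to a common reference time.

    elapsed_min is the time in minutes between the reference time
    (e.g., time of injection) and the time the sample was counted.
    """
    return counts * math.exp(math.log(2.0) * elapsed_min / half_life_min)


def relative_counting_error(total_counts):
    """Relative SD of a measurement under Poisson counting statistics."""
    return 1.0 / math.sqrt(total_counts)


def net_injected_counts(total_injected_counts, injection_site_counts):
    """Injected counts after subtracting activity retained at the injection site."""
    return total_injected_counts - injection_site_counts


if __name__ == "__main__":
    # A sample counted one half-life after the reference time is doubled.
    print(decay_correct(1000.0, 110.0, 110.0))
    # A preset of 10,000 counts gives the same 1% relative error for every sample.
    print(relative_counting_error(10000))
```

Because the relative counting error depends only on the number of accumulated counts, a preset-count protocol gives every sample the same counting precision, whereas a preset-time protocol leaves low-activity tissues with larger relative errors.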
Special Technical And Statistical Considerations In Small Animal Imaging.
The recent implementation of imaging devices suitable for in vivo imaging of small rodents (mice, rats) has presented both advantages and complications for the field of radiopharmaceutical development. As a non-invasive procedure, imaging offers the ability to obtain pharmacokinetic data in potentially fewer animals, and to utilize animals in test-retest protocols and in longitudinal studies. These advantages may be particularly important when expensive genetically manipulated animals are being employed, as the number of such animals might be minimized. The use of imaging techniques does not, however, necessarily reduce the number of animals needed to obtain a valid statistical result. Most of the factors enumerated above that affect the variability of the data also apply to imaging studies, with the obvious substitution of imaging-specific requirements such as anesthesia (utilized except in rare cases), instrument calibration, and image analysis (placement of regions-of-interest). Few studies have directly compared imaging results with dissection or autoradiography studies in terms of the respective variances of the data and the ability to discriminate between groups. For determination of a pharmacokinetic curve for a new radiotracer, an imaging technique in which the complete curve can be obtained within a single subject is certainly advantageous compared with a cross-sectional study using a minimal number of animals (3) at each of multiple time points (4-5 time points, for a total of 12-15 animals), and coupling imaging with genetically altered animals offers a resource-effective means of validating radiotracer specificity [10]. Imaging methodologies are also advantageous when directly comparing one radiotracer to another, as full pharmacokinetic curves can be obtained for both radiotracers in a single test-retest protocol.
The design of studies that compare groups of animals using imaging as the analytical method still requires an adequate number of subjects to meet statistical considerations based on the variance of the data. Where the variance in the imaging data is similar to that obtained in dissection studies (typically 10-15%), the number of animals needed is the same, but the cost of the imaging studies is certainly much higher given the initial investment in the imaging scanners and the greater complexity of the study [11]. The use of an animal as its own control for imaging studies, either in a test-retest setting on a single day or at later times, does utilize fewer animals and allows paired statistics, and
thus does offer advantages as compared to dissection or autoradiography experimental
designs. For test-retest studies, investigators do have to keep in mind the need for
maintaining similar physiologic characteristics and positioning of the animals between
scans.
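The advantage of paired statistics in a test-retest design can be illustrated with the same kind of summary-statistic calculation used for unpaired groups. The Python sketch below is a hypothetical illustration, not a method taken from the studies cited here. It assumes a baseline mean normalized to 100, a retest mean lower by the stated percent difference, an SD in each scan equal to the CV times that scan's mean, and a within-animal correlation `r` between the two scans; `r` is an assumed parameter that would have to be estimated from pilot data. The SD of the paired differences is then sqrt(sd1^2 + sd2^2 - 2*r*sd1*sd2), and a paired t test with n - 1 degrees of freedom is applied.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_p(t, df, steps=2000):
    """Two-tailed P value via Simpson's rule on the density over [0, |t|]."""
    upper = abs(t)
    if upper == 0.0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3.0)


def paired_p(percent_diff, percent_cv, n, r, baseline_mean=100.0):
    """P value for a paired t test from summary statistics.

    r is the assumed correlation between test and retest values
    within the same animal (hypothetical parameter, 0 <= r < 1).
    """
    retest_mean = baseline_mean * (1.0 - percent_diff / 100.0)
    sd1 = baseline_mean * percent_cv / 100.0
    sd2 = retest_mean * percent_cv / 100.0
    sd_diff = math.sqrt(sd1 ** 2 + sd2 ** 2 - 2.0 * r * sd1 * sd2)
    t = (baseline_mean - retest_mean) / (sd_diff / math.sqrt(n))
    return two_tailed_p(t, df=n - 1)


def min_animals_paired(percent_diff, percent_cv, r, alpha=0.05, max_n=50):
    """Smallest number of animals (each scanned twice) giving P < alpha."""
    for n in range(2, max_n + 1):
        if paired_p(percent_diff, percent_cv, n, r) < alpha:
            return n
    return None


if __name__ == "__main__":
    for r in (0.5, 0.8):
        print(r, min_animals_paired(20, 20, r))
```

In this sketch the benefit of pairing depends entirely on the within-animal correlation: the more reproducible the measurement within an animal (for example, through consistent physiologic state and positioning between scans), the smaller the SD of the differences and the fewer animals are needed.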
References
[1] Shimoji K, Esaki T, Itoh Y, Ravasi L, Cook M, Jehle J, Jagoda EM, Kiesewetter DO,
Schmidt K, Sokoloff L, Eckelman WC. Inhibition of [18F]FP-TZTP binding by loading
doses of muscarinic agonists P-TZTP or FP-TZTP in vivo is not due to agonist-induced
reduction in cerebral blood flow. Synapse. 2003;50(2):151-63.
[2] Eckelman WC. The use of PET and knockout mice in the drug discovery process. Drug
Discov Today 2003;8(9):404–10.
[3] Zambrowicz BP, Sands AT. Knockouts model the 100 best-selling drugs: will they
model the next 100? Nat Rev Drug Discov 2003;2:38–51.
[4] Zambrowicz BP, Turner GA, Sands AT. Predicting drug efficacy: knockouts model
pipeline drugs of the pharmaceutical industry. Curr Opin Pharmacol 2003;3:1–8.
[5] Eckelman WC. Sensitivity of new radiopharmaceuticals. Nucl Med Biol
1998;25(3):169–73.
[6] Vera DR, Krohn KA, Scheibe PO, Stadalnik RC. Identifiability analysis of an in vivo
receptor-binding radiopharmacokinetic system. IEEE Trans Biomed Eng 1985;32(5):312–
322.
[7] Chatterjee-Kishore M, Miller CP. Exploring the sounds of silence: RNAi-mediated
gene silencing for target identification and validation. Drug Discov Today.
2005;10(22):1559-65.
[8] GraphPad InStat version 3.0a for Macintosh, GraphPad Software, San Diego California
USA, www.graphpad.com.
[8] Hussey et al. "Statistical Power Analysis of In Vivo Studies in Rat Brain Using PET
Radiotracers", in Carson, Daube-Witherspoon and Herscovitch, Eds. Quantitative
Functional Brain Imaging with Positron Emission Tomography, Academic Press, 1998, pp
273-277.
[9] Tokugawa J, Ravasi L, Nakayama T, Lang L, Schmidt KC, Seidel J, Green MV,
Sokoloff L, Eckelman WC. Distribution of the 5-HT(1A) receptor antagonist
[(18)F]FPWAY in blood and brain of the rat with and without isoflurane anesthesia. Eur J
Nucl Med Mol Imaging. 2006 Sep 22; [Epub ahead of print]
[10] Nabulsi NB, Smith DE, Kilbourn MR. [11C]Glycylsarcosine: synthesis and in vivo
evaluation as a PET tracer of PepT2 transporter function in kidney of PepT2 null
and wild-type mice. Bioorg Med Chem. 2005;13(8):2993-3001.
[11] Ma B, Sherman PS, Moskwa JE, Koeppe RA, Kilbourn MR. Sensitivity of
[11C]N-methylpyrrolidinyl benzilate ([11C]NMPYB) to endogenous acetylcholine: PET
imaging vs. tissue sampling methods. Nucl Med Biol. 2004;31(4):393-7.