Introduction

Tailoring the intensity of treatment to the patient's risk of relapse is one of the major issues in the current therapeutic strategy for childhood acute lymphoblastic leukemia (ALL).1 Risk classification uses both clinical (age, liver and spleen size)2,3 and biological (leukocyte or blast count, phenotype, DNA index, chromosomal abnormalities) features at diagnosis,3,4 together with the evaluation of early response to pre-phase treatment.5,6

The medium-risk group (MRG) represents a rather heterogeneous cohort of patients.1,7 In the recently closed national studies of the Austrian and German Berlin–Frankfurt–Münster (BFM) group (ALL-BFM 90),8 the Associazione Italiana di Ematologia e Oncologia Pediatrica (AIEOP-ALL 91)9 and the Dutch Childhood Leukemia Study Group (DCLSG, protocol ALL-8),10 patients were stratified into standard-risk (SR), medium-risk (MR) and high-risk (HR) treatment groups, mainly according to the presenting features, ie leukemic cell mass and prednisone response.2 The medium-risk group covers more than half of newly diagnosed ALL, and 20–25% of these children relapse by 4–5 years from diagnosis.8–10 Compared with high-risk patients, the percentage of failures is relatively low, but medium-risk patients account for more than half of all relapses in most studies.8–10 Children who are likely to relapse should be identified early enough to allow intensification of their treatment schedule and thereby improve their outcome.

We and others have recently shown that monitoring of minimal residual disease (MRD) by highly sensitive molecular or immunological approaches gives clinically relevant insight into the effectiveness of treatment.11–15 Combined information on MRD from the first 3 months of treatment identified patients at different risk of relapse, and has been proposed as relevant for tailoring treatment.11–14 The MRD series of the International BFM Study Group (I-BFM-SG) is the largest reported so far,14 and 65% of this series consisted of patients with medium-risk features. However, the evaluation of the prognostic impact of MRD for these medium-risk patients was hampered by the relatively small number of events and by the heterogeneity of the group with regard to MRD information.

This prompted us to design a matched case-control study, focussing on the medium-risk patients already included in the I-BFM-SG MRD study14 and increasing the number of medium-risk relapsed cases analyzed for MRD, with the same methods and in the same countries that participated in the previous study. This is an efficient type of study design, which provides an estimate of the impact of MRD on the odds of failure and adjusts for heterogeneity in presenting features by matching.16 In addition, the study allowed us to explore different types of MRD classification and, under certain assumptions, to project the impact of early detection of MRD positivity on the relapse-free interval.

Materials and methods

Patients and cell samples

Bone marrow samples were taken at diagnosis and at two follow-up time-points (time-point 1 corresponds to the end of phase Ia of induction, ie 5–6 weeks from diagnosis; time-point 2 is before consolidation treatment, ie 3 months from diagnosis)8–10 in 29 relapsing ALL patients and in the same number of matched controls (see below). Of the 29 cases, 17 relapses in the MRG were already included in the I-BFM-SG MRD series14 and 12 cases were additionally analyzed for MRD. All relapses occurred in the bone marrow. All controls were from the I-BFM-SG MRD series. All children were enrolled in the ALL-BFM 90, the AIEOP-ALL 91 or the DCLSG ALL-8 protocols,8–10 which shared the same BFM-based criteria for risk definition and intensive chemotherapy.2 We considered patients with B cell precursor phenotype and age greater than 1 year who were classified as medium risk according to the following criteria: BFM risk factor (RF) ≥0.8 (and <1.7 only for AIEOP-ALL 91);2 good prednisone response (defined as a peripheral blood blast cell count/μl at day 8 of <1000);2 complete remission (CR) at day 35 or 42; absence of t(9;22) and t(4;11) translocations; and no CNS disease (only for AIEOP-ALL 91).9

Mononuclear cells were isolated from the bone marrow samples and stored in liquid nitrogen or at −70°C for DNA extraction. Of the 58 patients included in the analysis, 10 were from Austria, 13 from Germany, 28 from Italy and seven from the Netherlands. They fulfilled the following criteria: (1) preferably two PCR targets, at least one of which reached a sensitivity of 10⁻⁴; (2) MRD data known at the two predefined time-points.14 In the three cases where MRD data were available only at time-point 2, patients were included if the MRD level was >10⁻⁴, as we assumed the same or higher MRD level at the previous time-point.

Study design and statistical analysis

This study is partially nested in the cohort of patients prospectively enrolled in the I-BFM-SG study on MRD.14 The ‘cases’ in this study are patients who relapsed. Each case was matched to one control selected among patients of the same gender and country who had been in continuous complete remission (CCR) at least as long as the case and who had similar WBC count and age at diagnosis. According to the sample size calculation, 25 matched pairs were needed to show, with 90% power and a type I error level of 0.05, an odds ratio (OR) of 7.5, ie a 7.5-fold increase in the relapse rate in MRD-positive patients with respect to the others (assuming a 10% proportion of MRD positivity in the control group). A total of 29 cases were analyzed and, for each of these, one matched control was found in the original study cohort (in three cases, matching on sex alone was not possible). The odds ratio estimator, confidence limits and exact conditional test on the difference for matched case-control studies were calculated according to Breslow and Day.17 Secondary analyses contrasting MRD-based high-risk with MRD-based intermediate-risk and low-risk patients were performed for exploratory purposes. Note that, in the case-control study, cases (relapses) are by design over-represented at 50% of the patients, compared with the natural proportion of 21% observed in medium-risk patients in the prospective cohort of the I-BFM-SG study.14 In order to estimate the relapse-free interval (RFI) (defined as the time from complete remission to relapse; if no relapse occurred, the time is censored at the last follow-up or at death in remission), we resorted to the background hazard function as estimated, according to Cox,18 in the prospective cohort, and made the following assumptions: proportional hazards between MRD-based risk groups, and the OR estimate as an approximation of the hazard ratio.
Further, in interpreting the results, we should be aware of possible biases that could have occurred in the cohort and in the case-control study if preserved samples were disproportionately available for different types of patients.
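As an illustration of the matched-pair analysis described above, the following minimal sketch computes the conditional odds ratio and a two-sided exact conditional P value for 1:1 matched pairs with a binary exposure (here, MRD-based high risk: yes or no). It assumes the textbook form in which only discordant pairs contribute. The function names and the pair counts are hypothetical and are not taken from Table 2.

```python
from math import comb


def matched_pair_odds_ratio(case_only: int, control_only: int) -> float:
    """Conditional estimate of the OR for 1:1 matched pairs.

    case_only    = pairs in which only the case is exposed
    control_only = pairs in which only the control is exposed
    Concordant pairs carry no information and are ignored.
    """
    if control_only == 0:
        return float("inf")
    return case_only / control_only


def exact_conditional_p(case_only: int, control_only: int) -> float:
    """Two-sided exact P value under H0: OR = 1.

    Under H0 the number of case-only discordant pairs follows a
    Binomial(n_discordant, 0.5) distribution, which is symmetric,
    so the two-sided value is twice the larger tail (capped at 1).
    """
    n = case_only + control_only
    if n == 0:
        return 1.0
    k = max(case_only, control_only)
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)


# Hypothetical discordant-pair counts, for illustration only
or_hat = matched_pair_odds_ratio(case_only=12, control_only=3)
p_value = exact_conditional_p(case_only=12, control_only=3)
print(f"OR = {or_hat:.1f}, exact two-sided P = {p_value:.4f}")
```

Because the estimate depends only on the discordant pairs, it cannot be recovered from the marginal counts of exposed cases and controls; the pair-level data are needed.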

Identification of PCR targets at diagnosis

The procedure for the identification of the patient-specific probe according to the junctional regions of the T cell receptor (TCR) gamma (TCRG), delta (TCRD) and kappa deleting elements (Kde) recombinations has been described in detail.19 Briefly, the rearrangements were detected by Southern blot analysis and confirmed by PCR analysis and direct sequencing of the junctional regions with standardized sets of oligonucleotide primers. On the basis of the sequence data of the junctional regions, patient-specific oligonucleotides were designed for each identified MRD-PCR target, using OLIGO 5.0 software (National Biosciences, Plymouth, MN, USA).19

MRD detection during follow-up

The MRD-PCR analyses of bone marrow samples during follow-up were done by single PCR analysis of 1 μg of DNA (equivalent to 10⁵–10⁶ cells) with the standardized primer sets, followed by dot blotting and hybridization with the corresponding ³²P-labelled patient-specific junctional region probe, as previously described.19,20 The hybridization signals were visualized by use of radiographic films or phosphor-imaging. The sensitivity of each identified MRD-PCR target was established by use of a dilution experiment, in which DNA from the leukemic cells at diagnosis was serially diluted in 10-fold steps into control DNA from a mixture of blood mononuclear cells of about 10 different healthy donors.19,20 The concentration of leukemic cells in the bone marrow samples during follow-up was estimated by comparing the signals with those of the 10-fold dilution samples of DNA at diagnosis. This resulted in reproducible semi-quantitative estimations of MRD-PCR results of 10⁻² or more, 10⁻³, and 10⁻⁴ or less.

Results

Median WBC was 17 × 10⁹/l among cases (range: 4–160 × 10⁹/l) and 14 × 10⁹/l among controls (range: 1–116 × 10⁹/l). Median age at diagnosis was 44 and 42 months among cases and controls, respectively. Median time from diagnosis to relapse was 30 months in the 29 relapsing patients who constitute the cases; the median follow-up time of the patients in CCR who are the matched controls was 50 months. Table 1 summarizes the degree of MRD at time-points 1 and 2 in the cases (a) and control series (b). According to the MRD information at time-points 1 and 2, patients were classified in the low-risk MRD group when MRD negativity was present at both time-points (group A), in the high-risk group when MRD was ≥10⁻³ at both time-points (group C), while all remaining patients were classified in the MRD-based intermediate-risk group (groups B, D, E). As shown in Table 1, MRD-based high-risk patients were more frequent among cases than controls (14 vs 2), while MRD-based low risk was under-represented in cases as compared with controls (1 vs 18). The remaining patients (14 cases and nine controls) had MRD-based intermediate-risk features.14 The case-control results are shown in Table 2, both for the MRD-based high-risk patients vs all other patients (a) and for the three MRD-based risk groups (b). The MRD-based high-risk patients experienced a significantly higher relapse rate than all others, according to the estimated seven-fold increase in the odds of failure (OR = 7.0, P = 0.01), and a much higher rate than patients with MRD-based low-risk features (OR = 35.7, P = 0.003). The odds of failure for patients with MRD ≥10⁻³ are higher than for MRD-based intermediate-risk patients (groups B, D, E), but the difference is not significant (OR = 3.0, P = 0.18) (b). As most of the cases (24) had relapsed early, ie within 3 years from diagnosis, we also restricted the analysis to this subset, which is more homogeneous in terms of outcome. The results on the 24 matched pairs, as shown in Table 3, are very similar to those obtained without time constraints in the design.
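The MRD-based grouping rule described above can be written as a short function. This is a minimal sketch under two assumptions: MRD levels are encoded as fractions (0.0 for MRD negative, 1e-3 for 10⁻³, and so on), and the high-risk threshold is read as "at least 10⁻³ at both time-points". The function name and the encoding are illustrative and do not come from the study.

```python
def mrd_risk_group(tp1: float, tp2: float) -> str:
    """Assign an MRD-based risk group from the levels at time-points 1 and 2.

    Levels are fractions of leukemic cells; 0.0 encodes MRD negativity
    (an encoding chosen for this sketch only).
    """
    if tp1 == 0.0 and tp2 == 0.0:
        return "low"            # negative at both time-points (group A)
    if tp1 >= 1e-3 and tp2 >= 1e-3:
        return "high"           # at least 10^-3 at both time-points (group C)
    return "intermediate"       # every other combination (groups B, D, E)


# Illustrative patients, not study data: (level at TP1, level at TP2)
examples = [(0.0, 0.0), (1e-2, 1e-3), (1e-3, 1e-4), (1e-4, 0.0)]
for tp1, tp2 in examples:
    print(tp1, tp2, mrd_risk_group(tp1, tp2))
```

Checking the low-risk and high-risk conditions first leaves the intermediate group as the catch-all, which mirrors the way the text defines it as "all remaining patients".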

Table 1  Series of relapsing ALL and matched controls
Table 2  Estimated odds ratios (OR) according to different MRD levels
Table 3  Estimated odds ratios (OR) according to different MRD levels, accounting for cases relapsed within 3 years and matched controls

In order to project the relapse-free interval curve in the three strata defined by MRD, we combined results from the original prospective cohort14 and from this case-control study. The background relapse rate was estimated on the original cohort,18 accounting for the natural mixture of patients and of relapses, while the OR estimates (Table 2) were used to discriminate the outcome between strata. The projected 4-year relapse-free interval was 44.7%, 76.4% and 97.7% for the MRD-based high-risk, intermediate-risk and low-risk groups, respectively. Although they must be read with caution because of the statistical assumptions underlying this calculation, these figures illustrate the heterogeneity present among medium-risk patients (>1 year of age). Finally, we explored the implication of changing the cut-off point for the definition of MRD-based high risk from 10⁻³ to 10⁻⁴ (data not shown). This change induced a slight decrease in the odds ratio estimates contrasting the three MRD categories but, more importantly, a decrease in the specificity of the MRD test. The low sensitivity of the test, defined as in Table 2 (14/29 = 48%), is counterbalanced by a high specificity (27/29 = 93%). Lowering the cut-off point for the definition of positivity increases sensitivity (up to 69%) at the cost of a decrease in specificity to 77% (data not shown).
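The two calculations in this paragraph can be sketched as follows. The first function reproduces the sensitivity and specificity from the counts quoted in the text (14 of 29 cases and 2 of 29 controls classified as MRD-based high risk). The second shows the proportional hazards relation S_g(t) = S_0(t)^HR_g, with the OR standing in for the hazard ratio as stated in the methods. Taking the low-risk group as the baseline is an assumption of this sketch, and both function names are hypothetical. The authors worked from the background hazard estimated in the prospective cohort, so this simplified version is not expected to reproduce the published projections exactly.

```python
def sensitivity_specificity(pos_cases: int, n_cases: int,
                            pos_controls: int, n_controls: int):
    """Sensitivity = test-positive cases / all cases;
    specificity = test-negative controls / all controls."""
    sensitivity = pos_cases / n_cases
    specificity = (n_controls - pos_controls) / n_controls
    return sensitivity, specificity


def project_rfi(baseline_rfi: float, hazard_ratio: float) -> float:
    """Relapse-free proportion in a stratum under proportional hazards:
    S_g(t) = S_0(t) ** HR_g, where S_0 is the baseline stratum."""
    return baseline_rfi ** hazard_ratio


# Counts quoted in the text for the 10^-3 cut-off
sens, spec = sensitivity_specificity(14, 29, 2, 29)
print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")

# Assumption of this sketch: low-risk group as baseline, ORs used as HRs
baseline_low = 0.977                      # reported 4-year RFI, low-risk group
for label, hr in [("intermediate", 12.3), ("high", 35.7)]:
    print(label, round(project_rfi(baseline_low, hr), 3))
```

A hazard ratio of 1 leaves the baseline unchanged, and larger ratios lower the projected relapse-free proportion, which is the ordering seen in the reported figures.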

Discussion

This case-control study of MRD in childhood ALL demonstrates that careful molecular monitoring of in vivo treatment response might provide the tools to target more intensive therapy to medium-risk B cell precursor ALL patients at true risk of relapse. Although variably defined in different study groups, this group accounts for 50–60% of total ALL and comprises the largest number of relapses, which are still unpredictable with currently available genetic or immunological markers.1 In the context of the I-BFM study on MRD,14 a total of 81 patients with B cell precursor ALL, over 1 year of age and classified as medium risk according to clinical and biological features,8–10 were found to be remarkably heterogeneous with respect to MRD levels. According to the MRD degree at time-points 1 and 2,14 15% were classified as MRD-based high risk, 47% as MRD-based low risk and the remaining 38% as MRD-based intermediate risk. Overall they experienced 17 relapses, all but one in the MRD-based high (9) and MRD-based intermediate (7) risk groups. The limited number of events hampered the evaluation of the real impact of MRD detection in assessing the risk of relapse within the medium-risk ALL subgroup.

We designed a matched case-control study, partially nested in the cohort: we used the data already available as much as possible,14 and the study was enriched with new cases in order to reach the target sample size. The analysis performed on 29 matched case-control pairs clearly confirms that high MRD levels (≥10⁻³) represent a strong prognostic factor, being associated with a seven-fold increase (CI: 1.96–45.45) and a 35-fold increase (CI: 5.32–1000.0) in the rate of relapse when compared with all other patients (MRD-based low or intermediate risk) and with only those with MRD-based low risk, respectively. MRD-based intermediate-risk patients showed a failure rate not significantly different from that of the MRD-based high-risk patients, but still significantly different from that of the MRD-based low-risk patients (OR = 12.3, CI: 2.3–38.2).

Interestingly, five relapses occurred more than 36 months after diagnosis (at 39, 43, 45, 46 and 50 months). In two of these cases, the levels of MRD at both time-points 1 and 2 were ≥10⁻³ (MRD-based high risk), so these patients would potentially have been eligible for a more intensive treatment. The remaining three cases were classified as intermediate risk and accordingly leave open the question of how well MRD, as measured at early time-points of treatment, can identify patients at risk of late or very late relapses.

These data confirm and further extend the strong predictive value of MRD detection as an independent prognostic factor in childhood ALL,11–15 in particular for those patients considered at low risk of relapse.10–15 They pertain to a very large ALL subgroup, with B cell precursor phenotype, age greater than 1 year and a good early response to treatment (as assessed by in vivo prednisone response), which is thus not likely to be further classified for the risk of relapse according to standard clinical features. Several known biological features could additionally be considered to further stratify risk classification for medium-risk B-lineage ALL.21–25 With a view to further improving genetically based risk classification, and to achieving a more rational selection of therapy (based on the risk of treatment failure), MRD detection could be used as a new prognostic factor.

If tailored treatment remains the major goal to be achieved, the challenge is still to decide which clinical option should be considered to improve the outcome of ALL patients with intermediate- or high-risk MRD features. Although it appears unlikely that the MRD-based intermediate-risk subgroup can achieve the same results as those obtained in the MRD-based low-risk subgroup, there is still room for treatment options to be tested in large clinical studies. Nachman et al26 recently showed that augmented post-induction chemotherapy results in an excellent outcome for patients with high-risk ALL (1 to 9 years of age with a WBC of at least 50 × 10⁹/l, or 10 years of age or older) and with a slow response to initial therapy. This augmented post-induction therapy could represent a clinical option to be evaluated for medium-risk B cell precursor ALL patients who are reassigned to high risk according to MRD data. Along this line, the future clinical studies of the I-BFM-SG will test the relevance of MRD risk classification for pediatric ALL.