International Journal of Molecular Medical Science
            
            
            
            
in the Discovery program and a Monte Carlo strategy in the Affinity program. Each energy-minimized final docking position of the complex was evaluated using the interactive score function in the LUDI module, visual inspection, and interface analyses (including contact surface area, steric tension, and improper rotamer positions) by GRASP, CHAIN and O. The final binding position of the complex was determined based on the evaluation of favorable binding interactions using the LUDI score function. The parameters used in this docking strategy included searching for five unique structures, 1 000 minimization steps for each structure, an energy range of 10.0 kcal/mol, a maximum translation of the ligand of 3.0 Å, a maximum rotation of the ligand of 10°, and an energy tolerance of 1 500 kcal/mol.
            
            
              
2.4 Bioinformatics and Statistical Analysis of Gene Expression Profiles
              
            
            
The publicly available archived GSE32311 dataset was used to compare gene expression changes in CD4+CD8+ double-positive wild-type (N=3; GSM800500, GSM800501, GSM800502) vs. IKZF1 null mouse thymocytes (N=3; GSM800503, GSM800504, GSM800505) from the same genetic background (C57BL/6 x 129S4/SvJae). Probe-level RMA signal intensity values were obtained from the mouse 430_2.0 Genome Array. Up-regulated and down-regulated transcripts in IKZF1 knockout mice were identified by filtering for changes greater than 2-fold and t-test P-values less than 0.05 (two-sample t-test, unequal variances, Excel formula). We identified 1 158 transcripts representing 924 genes that were down-regulated in IKZF1 null mice, with a subset of 201 transcripts representing 137 genes exhibiting >2-fold decreased expression levels.
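The filtering rule described above (a change greater than 2-fold combined with an unequal-variance t-test P-value below 0.05) can be sketched in a few lines of Python. This is a minimal illustration and not the Excel workflow used in the study. It assumes that the RMA values are on a log2 scale, so that a 2-fold change corresponds to a difference of 1 between group means. The function names (`two_sided_p`, `welch_t_test`, `classify_probe`) and the example intensities are hypothetical. The P-value is obtained by numerically integrating the Student t density so that the sketch needs only the standard library.

```python
import math
from statistics import mean, variance


def two_sided_p(t, df, steps=4000):
    """Two-sided Student t P-value via Simpson integration of the density."""
    t = abs(t)
    if t == 0.0:
        return 1.0
    # Log of the normalizing constant of the t density with df degrees of freedom.
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    h = t / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    central = 2 * total * h / 3  # probability mass between -|t| and +|t|
    return max(0.0, 1.0 - central)


def welch_t_test(a, b):
    """Two-sample t-test with unequal variances; returns (t, df, p)."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    if va + vb == 0:
        raise ValueError("both groups are constant; the test is undefined")
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df, two_sided_p(t, df)


def classify_probe(wild_type, knockout, min_fold=2.0, alpha=0.05):
    """Return 'up', 'down', or None for one probe set (log2 values assumed)."""
    log2_change = mean(knockout) - mean(wild_type)
    _, _, p = welch_t_test(knockout, wild_type)
    if abs(log2_change) <= math.log2(min_fold) or p >= alpha:
        return None
    return "up" if log2_change > 0 else "down"


# Hypothetical log2 intensities for one probe set (N=3 per group).
example_wild_type = [10.1, 10.0, 9.9]
example_knockout = [8.0, 8.1, 7.9]
example_call = classify_probe(example_wild_type, example_knockout)
```

With only three samples per group, the Welch degrees of freedom are small, so a probe set needs both a large shift and tight replicates to pass the P-value cut-off.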
            
            
By cross-referencing this IK-regulated gene set with the archived ChIP-seq data (GSM803110) using the Integrative Genomics Viewer (Robinson et al., 2011), we further identified 45 Ikaros target genes that harbored IK binding sites (Uckun et al., 2012). The GenePattern web-based software (http://www.broadinstitute.org/cancer/software/genepattern) was used to extract expression values from the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database to compile gene expression profiles of human lymphocyte precursors in 1 104 primary leukemia specimens from pediatric ALL patients (GSE3912, N=113; GSE18497, N=82; GSE4698, N=60; GSE7440, N=99; GSE13159, N=750). We focused our analysis on 45 validated IK target genes (Uckun et al., 2012). Expression values expressed as Standard Deviation units were compiled for the 5 studies and rank ordered according to the mean expression of three highly correlated transcripts (208642_s_at (XRCC5), 208643_s_at (XRCC5), 200792_at (XRCC6)). Prospective power analysis was utilized to determine the Standard Deviation cut-off for “high Ku expression” and “low Ku expression” in the data sets. To control the False Positive Rate (FPR) when testing for differences in 3 Ku transcripts out of approximately 20 000 transcripts common across the 5 Affymetrix platforms, we set the unadjusted P-value threshold at 2.5×10⁻⁶ (0.05/20 000; FPR = 0.05). A sample size greater than 132 would be sufficient to detect a difference of 1 standard deviation unit with 99.9% power. Therefore, samples were assigned to the “high Ku expression” group if their expression level was >0.5 standard deviation units higher than the mean expression level (N=314) and to the “low Ku expression” group if their expression level was >0.5 standard deviation units lower than the mean expression level (N=324). These samples were also rank ordered according to IKZF1 expression level (205038_at, 205039_s_at, 216901_s_at, 227344_at and 227346_at; 3 of these were common to all Affymetrix platforms: 205038_at, 205039_s_at, 216901_s_at), resulting in 302 ALL samples with high IKZF1 expression and 318 samples with low IKZF1 expression. T-tests were performed on the combined Standard Deviation units from the 5 datasets (2-sample, unequal variance correction, P-values <0.05 deemed significant) and revealed 27 transcripts representing 19 IK target genes (Table 1) and 13 transcripts representing 12 lymphoid-priming genes (Uckun et al., 2012; Ma et al., 2013) that were significantly up-regulated in samples with both high Ku and high IKZF1 expression levels. We used a one-way agglomerative hierarchical clustering technique to organize expression patterns using the average distance linkage method such that genes (rows) having similar expression across patients were grouped together (average distance metric).
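Two steps of this analysis lend themselves to a short sketch: assigning samples to expression groups from Standard Deviation units, and average-linkage clustering of gene rows. The Python below is a minimal illustration under stated assumptions, not the software used in the study. It assumes that values are standardized within each study using the sample standard deviation, that a sample's Ku score is the mean of its Ku transcript values in SD units, and that the distance between two gene rows is Euclidean. All function names and the example profiles are hypothetical.

```python
import math
from itertools import combinations
from statistics import mean, stdev


def to_sd_units(values):
    """Express one transcript's values within one study as SD units."""
    m, s = mean(values), stdev(values)  # sample SD: an assumption of this sketch
    if s == 0:
        raise ValueError("constant expression values cannot be standardized")
    return [(v - m) / s for v in values]


def ku_group(ku_sd_values, cutoff=0.5):
    """Classify one sample from the SD-unit values of its Ku transcripts."""
    score = mean(ku_sd_values)
    if score > cutoff:
        return "high"
    if score < -cutoff:
        return "low"
    return "intermediate"


def euclidean(u, v):
    """Euclidean distance between two expression vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def average_linkage(rows):
    """Agglomerative clustering of labeled rows; returns the merge history.

    rows maps a unique label (e.g. a probe set ID) to its expression vector.
    Each merge is recorded as (members_a, members_b, average_distance).
    """
    clusters = [(label,) for label in rows]
    merges = []

    def avg_dist(a, b):
        # Average linkage: mean of all pairwise distances between two clusters.
        pairs = [euclidean(rows[i], rows[j]) for i in a for j in b]
        return sum(pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the two clusters that are closest on average.
        a, b = min(combinations(clusters, 2), key=lambda pair: avg_dist(*pair))
        merges.append((a, b, avg_dist(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Hypothetical SD-unit profiles of four genes across two patients.
example_rows = {"g1": [0.0, 0.0], "g2": [0.0, 1.0],
                "g3": [5.0, 5.0], "g4": [5.0, 6.0]}
example_merges = average_linkage(example_rows)
```

The merge history holds the information a dendrogram displays: which rows were joined and at what average distance. The exhaustive pair search is adequate for a few dozen genes but scales poorly, so a dedicated clustering library would be preferable for larger gene sets.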
            
            
Dendrograms were drawn to illustrate similar gene-expression profiles from joining pairs of closely
            
            