The iResearch Institute 2021 application process opens on Nov. 23, 2020

iResearch Institute 2020 Highlights

Satvik Dasariraju

Detection and Classification of Immature Leukocytes for Diagnosis of Acute Myeloid Leukemia Using Random Forest
Hometown: Lawrenceville, NJ
Mentor: Marc Huo, Stanford University

Acute Myeloid Leukemia (AML) is a fatal blood cancer that must be detected early for effective treatment, but the diagnosis is time-consuming and inaccurate. This study presents a machine learning model capable of automatic screening for immature blood cells, which are a strong sign of AML, based on geometric and color features of cells. The proposed model detected immature cells with 92.99% accuracy and classified immature cells into four types with 93.45% accuracy, demonstrating that the model can be an efficient support tool for clinicians diagnosing AML.

Satvik Elayavalli

Link Between Systemic Lupus Erythematosus and Diffuse Large B Cell Lymphoma to Identify common targets for therapy
Hometown: Bengaluru, Karnataka, India
Mentor: Kendra Zhang, Columbia University

The NF-κB signaling pathway has been explored in Systemic Lupus Erythematosus (SLE), an autoimmune disease, and Diffuse Large B Cell Lymphoma (DLBCL), an aggressive non-Hodgkin’s lymphoma. This pathway synthesizes pro-inflammatory proteins in SLE and anti-apoptotic proteins in DLBCL. Datasets obtained from GEPIA2, STRING, and miRNet were analyzed using Cytoscape, and the analysis highlighted microRNA 21 (miR-21) as a potential therapeutic target due to its inhibitory effects. miR-21 inhibits the gene PTEN; this inhibition activates the PI3K/AKT pathway, which in turn activates the NF-κB pathway. These relationships have not been completely explored in SLE, suggesting a potentially novel importance for miR-21 as a therapeutic target.

Aaroosh Ramadorai

Searching for the host galaxy of FRB 181017
Hometown: Lexington, MA
Mentor: Kaitlyn Shin, Massachusetts Institute of Technology (MIT)

Fast radio bursts (FRBs) are millisecond-duration radio wave pulses of unknown astrophysical origins. Here, I search for the host galaxy of FRB 181017, discovered by the Molonglo Observatory Synthesis Telescope (UTMOST). Using comparisons with other FRBs, I propose a neutron star to be a likely origin for this FRB. I then use publicly available code and data to constrain the FRB’s redshift and celestial coordinates to obtain a sample of 13 host candidates. I consider 4 galaxies as likely candidates due to their high Hα fluxes and similar sizes to the Milky Way, both of which may imply high star formation rates, and therefore large neutron star populations from which FRB 181017 could originate.

Ryan Rudes

A Simple System for Strengthening Visual Representations
Hometown: Dix Hills, NY
Mentor: David Xu, Columbia University

Semi-supervised learning involves pretraining models on large collections of arbitrary data; the knowledge acquired may then be applied to a downstream task, allowing a network to generalize well with a small quantity of pre-separated training data. Recent state-of-the-art approaches to self-supervised pretraining are predominantly contrastive learning techniques. The current state of the art, SimCLR, learns a generalized understanding of visual representations by making comparisons between transformed copies of each instance in a large image dataset. We investigate the benefits of applying a learning problem of progressively growing difficulty to this dominant approach. Specifically, we propose a mechanism that enables more explicit control over the strength of the data augmentation operator throughout training, allowing us to intrinsically disincentivize the temporary exploitation of non-generalizable features and instead enforce the gradual attainment of reliable feature information.
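
The idea of controlling augmentation strength over the course of training can be illustrated with a minimal sketch. The abstract does not specify the schedule, so the linear ramp, the function names, and the start/end values below are assumptions made purely for illustration, not the author's mechanism.

```python
def augmentation_strength(step: int, total_steps: int,
                          start: float = 0.2, end: float = 1.0) -> float:
    """Linearly ramp a strength factor from `start` to `end` over training.

    Hypothetical schedule: the source only says that strength is
    controlled explicitly throughout training, not how.
    """
    if total_steps <= 1:
        return end
    # Clamp progress to [0, 1] so out-of-range steps stay well defined.
    frac = min(max(step / (total_steps - 1), 0.0), 1.0)
    return start + frac * (end - start)


def jitter_magnitude(strength: float, max_jitter: float = 0.8) -> float:
    """Scale an assumed maximum color-jitter magnitude by the current strength."""
    return strength * max_jitter


# Strength at each of five training steps: easy views first, harder views later.
schedule = [round(augmentation_strength(s, 5), 2) for s in range(5)]
```

Under this assumed schedule, early training steps see weakly transformed views and later steps see progressively stronger ones, which is one way to pose a learning problem of growing difficulty.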

Katie Sie

Bioinformatic Analyses Determine Biomarkers for Resistive Small Cell Lung Cancer
Hometown: Oakland Gardens, NY
Mentor: Madhav Subramanian, Washington University in St. Louis (WashU)

Treatment refractoriness is a hallmark of small cell lung cancer (SCLC), necessitating further research to mitigate poor prognosis. Datasets of untreated and DNA-damage-repair-inhibitor-treated samples were obtained. Gene set enrichment analysis (GSEA) was performed to identify upregulated pathways, elucidating resistance mechanisms in treated samples. Prominent genes in the identified pathways were visualized in RStudio and analyzed in GEPIA2 for their impact on survival. The reactive oxygen species pathway and TGF-beta signaling, whose genes are involved in evading apoptosis and promoting tumor growth, were upregulated. This study identifies novel genes that hold promising potential as therapeutic targets to resensitize tumors.

Sahand Adibnia

Modeling the Onset of Parkinson’s Disease: Dopamine Oxidation and Age-Dependent Neuromelanin Accumulation as Combined Multiclass Markers
Hometown: Dublin, CA
Mentor: Kendra Zhang, Columbia University

Parkinson’s disease is characterized by the loss of neuromelanin-containing dopaminergic neurons in the substantia nigra. Non-melanized dopaminergic neurons are relatively spared. Aminochrome has been suggested as the initial trigger of neurotoxicity in Parkinson’s. The genes NQO1 and GSTM2 metabolize aminochrome into non-neurotoxic compounds, which convert to neuromelanin. The non-melanized striatum exhibits higher expression of NQO1 and GSTM2 than the substantia nigra. This indicates that striatal aminochrome is metabolized without forming neuromelanin. NQO1 expression in the substantia nigra has a negative correlation with age, indicating that aminochrome levels increase with age. This could contribute to the increased incidence of Parkinson’s in the elderly.

Srihitha Dasari

Multiclass Classification of Alzheimer’s Disease: MRI Extraction of Cortical Volumetry and Inter-Cortical Ratios as Combined Markers
Hometown: Cumming, GA
Mentor: Marc Huo, Stanford University

Clinical diagnostic limitations and the progressive severity of Alzheimer’s disease necessitate improved, earlier detection for effective treatment administration. This study aimed to determine features that could increase the accuracy of multiclass classification, extracting inter-cortical volumetric ratios as a mode of normalization over traditional absolute volumes. Cortical tissues were segmented from pre-processed T1w MRIs, and their volumes were extracted and used, along with computed ratios, in three multiclass classifiers: conventional volumes, proposed ratios, and combined features. The combination of raw volumes and ratios as features achieved the greatest accuracy, and certain ratios were assigned greater feature importance than absolute volumes, indicating promising uses of ratios as volumetric features to increase performance. The proposed algorithm included more classes (5) and achieved a greater accuracy (81.03%) than state-of-the-art supervised approaches, suggesting increased ability in delineating different stages and potentially greater efficacy in medication administration.
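
The combined feature set described above (raw volumes plus ratios between them) can be sketched as follows. The region names and volume values are invented for illustration, and taking every pairwise ratio is an assumption; the abstract does not say which ratios were computed.

```python
from itertools import combinations
from typing import Dict


def combined_features(volumes: Dict[str, float]) -> Dict[str, float]:
    """Return raw volumes plus all pairwise volume ratios.

    Assumes every volume is non-zero. Region names are hypothetical.
    """
    features = dict(volumes)  # keep the absolute volumes
    # Sort names so each ratio key is deterministic.
    for a, b in combinations(sorted(volumes), 2):
        features[f"{a}/{b}"] = volumes[a] / volumes[b]
    return features


# Made-up volumes for three placeholder regions.
example = combined_features(
    {"region_a": 3.0, "region_b": 30.0, "region_c": 500.0}
)
```

A ratio such as `region_a/region_b` is unchanged if both volumes scale by the same factor, which is why ratios can act as a form of normalization across subjects with different overall brain sizes.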


Isabelle Garcia-Fischer

Identifying the Potential Roles of Cadherins in a Defective Wnt/β-Catenin Pathway in Hypermobile Ehlers-Danlos Syndrome
Hometown: Danbury, CT
Mentor: Swati Madankumar, Barnard College

Hypermobile Ehlers-Danlos Syndrome (hEDS) is a rare connective tissue disorder with no known genetic biomarker. To identify possible genetic etiologies of hEDS, the functions of genes dysregulated in hEDS skin fibroblasts were correlated with known cellular phenotypes and symptoms of hEDS. The downregulation of cadherin 2 and upregulation of cadherin 11 may cause mechanical stress and inflammation, respectively, which may disrupt Wnt/β-catenin signal transduction, ECM protein organization, and connective tissue homeostasis, likely in multiple organs, in hEDS patients. This study guides future in vitro experiments to confirm the roles of these genes, which could potentially guide hEDS diagnosis.

Ethan Horowitz

CET-CNN: Modular Hierarchical Image Classification Using Conditional-Execution Tree CNNs
Hometown: Manhasset, NY
Mentor: David Xu, Columbia University

Sequential neural network classifiers overgeneralize between classes. To address this, a tree-structured architecture, the Conditional-Execution Tree Convolutional Neural Network (CET-CNN), was designed to enable more specialized classification using subunits that classify and process inputs before passing them to further subunits. CET-CNN-A subunits performed multi-class classification, while CET-CNN-B subunits performed binary classification. CET-CNNs were compared to sequential networks and run with full, conditional, and single-path execution. CET-CNN-Bs achieved a higher accuracy than CET-CNN-As and sequential networks. CET-CNN-Bs and CET-CNN-As were most accurate with full execution and single-path execution, respectively. CET-CNN-Bs improved accuracy over sequential networks by 8.25%. Single-path execution reduced effective network size by up to 81%.
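
Single-path execution through a tree of subunits can be illustrated with a toy sketch. The hand-written threshold rules below stand in for trained CNN subunits, and the class names, `Subunit` type, and `classify_single_path` function are hypothetical; this is not the author's implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Subunit:
    """One node of the tree: picks a child subunit or emits a final label."""
    name: str
    route: Callable[[List[float]], str]
    children: Dict[str, "Subunit"] = field(default_factory=dict)


def classify_single_path(root: Subunit, x: List[float]) -> Tuple[str, List[str]]:
    """Follow only the chosen branch at each level; return label and visited nodes."""
    node, visited = root, []
    while True:
        visited.append(node.name)
        choice = node.route(x)
        if choice not in node.children:  # no such child, so it is a leaf label
            return choice, visited
        node = node.children[choice]


# Toy binary subunits (in the spirit of CET-CNN-B) using threshold rules.
animal = Subunit("animal", lambda x: "cat" if x[1] > 0 else "dog")
vehicle = Subunit("vehicle", lambda x: "car" if x[1] > 0 else "truck")
root = Subunit("root", lambda x: "animal" if x[0] > 0 else "vehicle",
               {"animal": animal, "vehicle": vehicle})

label, path = classify_single_path(root, [1.0, -1.0])
```

In this toy tree only two of the three subunits run for any input, which is the sense in which single-path execution reduces the effective network size.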

Emma Wang

Investigating Potential Agro-economic Benefits of Solar Pollinator Habitats
Hometown: Manhasset, NY
Mentor: Anjali Chadha, Massachusetts Institute of Technology (MIT)

American beekeepers lose approximately 42% of their colonies annually, affecting hundreds of pollination-dependent crops and costing the agro-economy $15 billion each year. This study analyzed the potential benefits of establishing solar-pollinator habitats in the U.S., as co-location of solar energy and agriculture can protect pollinators and crops and simultaneously produce energy. Crops were classified by pollinator-dependence, and overlapping areas of solar facilities and cropland were measured. Highly pollinator-dependent crops were more economically valuable than less-dependent crops, and solar-agriculture overlap areas had enormous power capacities, showing the potential to provide clean energy for millions while providing habitats for plant and pollinator species.