Internship Mentors 2007

The 2007 mentors are listed below. The mentor list and content are still being updated.

Andrew Cameron (Caltech) | William A. Goddard III (Caltech) | Ian Haworth & Rebecca Romero (USC) | Gary Larson (City of Hope) | Jamil Momand & Jongwook Woo (CSULA) | Eric Mjolsness (UCI) | Jeanette Papp (UCLA) | Matteo Pellegrini (UCLA) | Soheil Shams (BioDiscovery) | Bruce Shapiro (Caltech) | Janet Sinsheimer (UCLA) | Barbara Wold (Caltech)

Andrew Cameron, Ph.D (Caltech)

http://sugp.caltech.edu

Description of CCRG

Division Lab Website

Dr. Cameron's Personal Website

Caltech Description:

The Center for Computational Regulatory Genomics maintains the sea urchin genome resource, conducts basic research on genome-related questions in developmental biology, and constructs and maintains the sea urchin genome database, SpBase.

Projects for Interns:

The SpBase project is just starting up this winter and there will be a number of different curation, data mining and database projects to do. These could involve a little lab work but most are purely computational.

Intern Requirements:

Some experience with Linux-based programming, such as script writing; at least a nodding acquaintance with PostgreSQL databases; and some exposure to molecular biology and genomics.

Internship Openings:

Up to 1 student.




William A. Goddard, Ph.D (Caltech)

http://www.caltech.edu

Caltech Description:

The G-protein coupled receptor (GPCR) superfamily of membrane proteins plays a critical role in the communication of cells with their environment by allowing them to sense extracellular signals and by transporting important molecules in and out of the cell. GPCRs transduce an extracellular signal (agonist binding) into an intracellular signal (G-protein activation). They are involved in many stimulus-response pathways ranging from sensory perception (vision, smell, taste, touch, pain) to intercellular communication. GPCRs are important drug targets because they are involved in many disease processes. The lack of 3-dimensional structures of GPCRs (the only one available is bovine rhodopsin, which has only 20% homology to the human GPCRs of interest, too low for useful homology-built structures) has prevented the rational design of drugs.

Projects for Interns:

A recent breakthrough in the Goddard lab at Caltech has led to computational methods to predict the 3-dimensional structures of GPCRs and of agonists and antagonists bound to them. Several systems have been successful (see references). With the current tools and understanding, a really good, hard-working (and lucky) student might, during the 7-week research portion of the CSULA Bioinformatics project, be able to predict the 3D structure of one GPCR and test it computationally by predicting the binding site for an agonist and an antagonist. Projects that could be suitable include olfactory receptors, taste receptors, serotonin receptors, chemokine receptors, histamine receptors, frizzled receptors, and many others.

Internship Openings:

Up to 2 students.




Ian S. Haworth, Ph.D & Rebecca M. Romero, Ph.D (USC)

http://www-hsc.usc.edu/~ihaworth

University of Southern California Description:

Our laboratory is interested in the computational design of biomolecular interfaces, generally with a therapeutic or diagnostic goal. We also have projects in computational analysis of nucleic acid folding, including fitting of data to experimental distances derived from ESR, and analysis of DNA bending in formation of condensed DNA for non-viral gene delivery. Most of our work is based on new algorithms that have been or are being developed in the laboratory. We also use commercially available algorithms for established methods such as molecular dynamics simulations and database searching.

Projects for Interns:

(i) RNA Aptamer Design: Aptamers are small nucleic acid fragments (typically about 30 bases) that bind with high affinity to a specific protein target. The project would involve the design of an aptamer against a particular protein, using a series of in-house algorithms for docking of RNA bases to a protein structure, fitting a backbone to the docked bases, solvating the complex, and running molecular dynamics simulations on the predicted complex.

(ii) Analysis of Nucleic Acid Folding: In collaboration with Dr. Peter Qin (USC Department of Chemistry), we are developing an algorithm to predict nucleic acid folding using constraints from EPR analysis of spin-labeled nucleic acids. The project will involve building spin-labeled nucleic acid structures using an in-house algorithm (NASDAC) and fitting theoretical data to experimental data.

(iii) Prediction of MHC-Peptide-TCR Interactions: Major histocompatibility complex (MHC) proteins play a key role in the immune response by binding peptides (that are derived from host or non-host proteins) and presenting them to T-cell receptors (TCRs). The project will involve use and development of two in-house algorithms, PePSSI and PePSSI-TCR, for docking of peptides to MHC molecules and subsequent investigation of the interaction of the peptide complex with the TCR.

Intern Requirements:

The most important attribute for the intern will be a good appreciation of biomolecular structure or, at least, the desire to learn quickly about the structure of proteins and nucleic acids. Familiarity with any form of computer programming will be useful, but this is not absolutely essential. Attention to detail will be very important in developing input parameters for running the algorithms, and good skills in numerical and structural analysis are needed for interpretation of results.

Internship Openings:

Up to 2 students.




Gary Larson, Ph.D (City of Hope)

http://www.cityofhope.org

City of Hope Description:

Since only one percent of human DNA comprises the exonic, protein-encoding regions of the genome, it seems logical to look for mutations that influence disease susceptibility by mechanisms other than protein structure or function. One mechanism worthy of pursuit is gene expression. Expression signatures are extremely powerful and are capable of distinguishing tumors from patients possessing either BRCA1 or BRCA2 mutations. This suggests that heritable risk variants (i.e., disease polymorphisms) in cis-acting transcriptional control elements may be one possible explanation for the aberrant transcript levels observed in breast tumors. To identify these expression risk variants dysregulated in breast cancer, our group uses a combination of statistics, comparative phylogenetics and family-based linkage methodologies. We perform meta-analyses and bioinformatics-based analyses of multiple, publicly available BrCa microarray datasets to identify statistically “worthy” candidates. We subsequently use extensive bioinformatic and comparative phylogenetic analyses of our candidates across orthologs both to select genes worthy of genetic experimentation and to identify evolutionarily conserved transcriptional regulatory elements. We also employ genetic enrichment strategies in a previously acquired cohort of multiplex families with disease (sibling pairs) using allele-sharing enrichment and postulated gene-gene interactions. Putative high-risk transcriptional alleles will be characterized to demonstrate abnormal interactions with elements of the transcriptional apparatus in biochemical assays. Student participants should expect exposure to a mixed bag of concurrent laboratory experimentation and computational analysis, and should be prepared to query external databases for specific targets, construct relational DBs, and use mining approaches to integrate diverse datasets.

Internship Openings:

Up to 1 student.




Jamil Momand, Ph.D & Jongwook Woo, Ph.D (CSULA)

Cal State University, Los Angeles Website

CSULA Description:

In organisms, oxidation of proteins has been thought to be a spontaneous event. Once a protein is oxidized, it is likely to be incapable of performing its normal function. It is the buildup of oxidized proteins that can lead to certain diseases, including Alzheimer's disease and amyotrophic lateral sclerosis. The amino acid side chain most susceptible to oxidation is cysteine. This amino acid contains a thiol group that can be oxidized to form a disulfide (S-S), sulfenic acid (S-OH), sulfinic acid (SO2H), and sulfonic acid (SO3H). Other oxidation products are also possible. Very little is known about what makes a cysteine thiol group susceptible to oxidation, other than that the thiol must be accessible to solvent. No software programs are available to predict which thiol groups on the surface of proteins are susceptible to oxidation. It appears that stabilizing the thiolate ion leads to increased susceptibility to oxidation. Given this information, it should be possible to create a software program that searches the Protein Data Bank and ranks proteins for susceptibility to oxidation.

Projects for Interns:

Determine from the literature which proteins fit the following criteria:

a) The oxidation site on the cysteine thiol group has been mapped.
b) The sites on the protein that are not oxidized have been mapped.
c) The structure of the reduced protein is solved and its atomic coordinates have been deposited in the Protein Data Bank.

The student will continue to work on a software program that we created, the Cysteine Oxidation Prediction Program (COPP), which will use the information above to calculate the conditions (i.e., solvent accessibility, neighboring polar atoms, and the distance of those polar atoms) that correlate with cysteine thiol oxidation.

The program will analyze every thiol in a training set of proteins with known structure and oxidation status. The program will be optimized to yield low false-positive and false-negative rates. COPP will then be used to test all proteins deposited in the Protein Data Bank.
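To illustrate the kind of per-thiol feature described above, here is a minimal Python sketch that counts polar (N and O) atoms near each cysteine sulfur in PDB-format coordinates. It is not COPP's actual method: the function names, the 4.0 Å cutoff, and the choice to ignore atoms in the cysteine's own residue are all assumptions made for illustration, and solvent accessibility is not computed here.

```python
import math


def parse_atoms(pdb_lines):
    """Extract atoms from ATOM/HETATM records using the fixed PDB column layout."""
    atoms = []
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        name = line[12:16].strip()
        element = line[76:78].strip().upper()
        if not element:
            # Element column missing: fall back to the first letter of the atom name.
            element = name.lstrip("0123456789")[:1].upper()
        atoms.append({
            "name": name,
            "res": line[17:20].strip(),
            "chain": line[21],
            "seq": int(line[22:26]),
            "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
            "element": element,
        })
    return atoms


def polar_neighbors_per_thiol(atoms, cutoff=4.0):
    """Count N and O atoms within `cutoff` angstroms of each cysteine SG atom.

    Atoms belonging to the cysteine's own residue are skipped (an assumption
    of this sketch). Returns a dict keyed by (chain, residue number).
    """
    counts = {}
    for sg in atoms:
        if sg["res"] != "CYS" or sg["name"] != "SG":
            continue
        key = (sg["chain"], sg["seq"])
        n_polar = 0
        for other in atoms:
            if other["element"] not in ("N", "O"):
                continue
            if (other["chain"], other["seq"]) == key:
                continue
            if math.dist(sg["xyz"], other["xyz"]) <= cutoff:
                n_polar += 1
        counts[key] = n_polar
    return counts
```

Given the lines of a PDB file, `polar_neighbors_per_thiol(parse_atoms(lines))` returns one count per cysteine. Such counts could then be compared against known oxidation status in a training set to estimate false-positive and false-negative rates.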

Internship Openings:

Up to 2 students.




Eric Mjolsness, Ph.D (UCI)

www.uci.edu

UC Irvine Description:

Interns would work on the Computable Plant project (www.computableplant.org) at UCI.

Projects for Interns:

One possible assignment is in outreach to high school science teachers (see the outreach section of the Computable Plant web site). Another is to try out new hypotheses with the modeling software.

Intern Requirements:

A primary skill would be the ability to learn to model in Cellerator (www.cellerator.org), which uses a computer algebra system, and to show other people how to do the same.

Internship Openings:

Up to 1 student.




Jeanette Papp, Ph.D (UCLA)

Bioinformatics in Gene Mapping
http://www.genetics.ucla.edu/sequencing

UCLA Description:

Dr. Papp is the Director of the UCLA Genotyping and Sequencing Core Facility, and a member of the UCLA Bioinformatics Core. In addition to overseeing data generation and analysis in the laboratory, her research interests include developing novel bioinformatic solutions for the management and analysis of all types of genetic data within the Department of Human Genetics.

Projects for Interns:

Mendel is a comprehensive package for exact statistical genetic analysis of qualitative and quantitative traits.  Mendel is widely used in genetic studies to localize susceptibility genes for complex diseases.  The intern's project will be to develop a web application that stores genomic and phenotypic data, and integrates with the computational engine of the Mendel statistical software package.

Intern Requirements:

Candidates must be proficient in designing rich-UI web applications and have an understanding of RDBMS. Candidates must have programming experience in PHP and an understanding of object-oriented programming concepts. In addition, candidates should have some experience developing interactive web applications that use asynchronous requests, such as AJAX. Previous experience with CakePHP is a plus. Candidates should be highly motivated and self-directed.

Internship Openings:

Up to 2 students.




Matteo Pellegrini, Ph.D (UCLA)

http://pellegrini.mcdb.ucla.edu

UCLA Description:

Our lab is interested in developing computational approaches to reverse engineer molecular networks. These network models allow us to elucidate the mechanisms of signal transduction, transcription and metabolism. Our approach is to build models that integrate varied data including measurements of gene expression, protein binding, phosphorylation and genome sequences. For example, we use genome sequence data to infer networks of co-evolving proteins, which allow us to study the function of most proteins. Currently, we are also developing methods to reconstruct dynamical networks of transcriptional regulation. Our long-term goal is to build network models that allow us to quantitatively predict the outcome of perturbations in cells.

Projects for Interns:

Various projects involve analyzing expression microarrays both to estimate the activities of the transcription factors that regulate the changes in gene expression seen in a particular experiment and to apply techniques for modeling the network of regulatory activities among these transcription factors.

Intern Requirements:

The ability to program and perform data analysis. We typically use Matlab to perform our analysis. However, knowledge of any programming language would be sufficient for these projects.

Internship Openings:

Up to 2 students.




Soheil Shams, Ph.D (BioDiscovery)

www.biodiscovery.com

BioDiscovery Description:

Gene microarrays have become recognized as powerful tools for providing a global view of gene expression regulation for a biological condition of interest. The other side to this double-edged sword is that such studies produce large amounts of interesting numerical results, making it difficult to get an intuitive grasp on what is happening biologically. Our company, BioDiscovery, is dedicated to providing researchers with useful software tools for gleaning biological meaning from large data sets. One area of interest is the discovery of gene interactions, which is useful in elucidating novel biological mechanisms. The summer internship project will involve applying tools such as clustering analysis, self-organizing maps, and genomic pathways to the discovery of new biological interactions between genes.

Projects for Interns:

Interns will be involved in establishing a microarray knowledge base. The task will involve researching published papers that use array technology, understanding the goal of the research, and downloading the raw data. The raw data will be processed using BioDiscovery proprietary tools, and the results will be integrated with other project results into a cohesive knowledge base. The knowledge base will then be queried to generate novel information. The project also involves integrating gene expression and CGH data.

Intern Requirements:

BioDiscovery is looking for individuals who are self-starters interested in working in a commercial environment. No explicit programming tasks are envisioned, but the ability to write macros is a plus. An understanding of microarray technology is a must.

Internship Openings:

Up to 2 students.




Bruce Shapiro, Ph.D (Caltech)

www.caltech.edu

Caltech Description:

The Biological Network Modeling Center (bnmc.caltech.edu) is directly involved in a wide variety of interdisciplinary projects, putting the BNMC at an exciting intersection of talent and activities in computation, biology, and theory. An intern might choose to work on one of the center's ongoing projects, or a student could design a project that overlaps one or more of our research areas.

Intern Requirements:

Students should have the following background: calculus; some understanding of differential equations (a full course is not necessary, just knowing what they are and having an interest in solving them numerically); a desire to work intensively doing computer modeling or computer programming; and enough background in biology to know what a signal transduction network is. Modeling will be done with Mathematica, but no prior knowledge of Mathematica is required. All work would be done at Caltech. For some projects students would coordinate their work with wet-bench researchers at Caltech, but no wet-bench research is involved in any of these projects.
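As a rough illustration of what solving a differential equation numerically means, the following Python sketch (the center's own modeling uses Mathematica) integrates a hypothetical one-step activation reaction, dx/dt = k_on (1 - x) - k_off x, with the forward Euler method and compares the result with the exact solution. The reaction, the rate constants, and all function names are assumptions chosen for illustration, not a model from the BNMC.

```python
import math


def euler(f, x0, t_end, dt):
    """Integrate dx/dt = f(t, x) from t = 0 to t_end with fixed-step forward Euler."""
    steps = round(t_end / dt)
    x = x0
    for i in range(steps):
        x += dt * f(i * dt, x)
    return x


def activation_rate(k_on, k_off):
    """Rate law for the active fraction x of a signaling protein (hypothetical model)."""
    def f(t, x):
        return k_on * (1.0 - x) - k_off * x
    return f


def exact_fraction(k_on, k_off, x0, t):
    """Closed-form solution of the same equation, used to check the numerical result."""
    x_ss = k_on / (k_on + k_off)  # steady-state active fraction
    return x_ss + (x0 - x_ss) * math.exp(-(k_on + k_off) * t)


if __name__ == "__main__":
    approx = euler(activation_rate(2.0, 1.0), x0=0.0, t_end=0.5, dt=0.001)
    exact = exact_fraction(2.0, 1.0, x0=0.0, t=0.5)
    print(f"Euler: {approx:.4f}  exact: {exact:.4f}")
```

Shrinking `dt` brings the Euler estimate closer to the exact value, at the cost of more steps. That trade-off is the basic idea behind the more sophisticated solvers used in practice.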

Internship Openings:

Up to 2 students.




Janet Sinsheimer, Ph.D (UCLA)

http://www.biostat.ucla.edu/people/sinshmer.htm

UCLA Description:

We develop statistical methodology for mapping complex trait and disease genes. We are particularly interested in understanding gene-by-gene and gene-by-environment interactions and their role in disease susceptibility. For example, our research has shown that specific interactions between maternal and fetal genes may produce an adverse prenatal environment that increases the risk of complex diseases in later life. In particular, we found that matching between maternal and fetal human leukocyte antigen (HLA) genes can lead to increased risk of schizophrenia. This maternal-fetal genotype interaction is consistent with the immunological intolerance hypothesis, which posits that HLA similarity between mother and fetus fails to stimulate the blocking antibodies that normally protect the fetus from the mother's immune response.

Projects for Interns:

We are currently analyzing genotype data with the goal of mapping polymorphisms and looking for gene by gene or gene by environment interactions that can increase risk of schizophrenia, rheumatoid arthritis, type II diabetes or breast cancer. An intern would assist in the analysis of these data. They would also help in the data management by developing programs that can automate the analyses.

Intern Requirements:

The ideal intern will have taken a course in statistics and so will understand inference and estimation. She/he will be proficient in some programming language and enjoy programming. She/he should also have some familiarity with mapping genes in humans.

Internship Openings:

Up to 1 student.




Barbara Wold, Ph.D (Caltech)

www.caltech.edu

Projects for Interns:

1. Informatics of a new approach to genomewide expression measurements that features integral splice isoform mapping.

2. Refining, implementing and testing an algorithm designed to computationally predict the microRNA component of gene networks. The muscle differentiation, degeneration and regeneration network is the focal test case.

3. Developing a user interface to link and visualize comparative genomics data (phylogenetic footprinting of preferentially conserved or highly variable domains) onto structural models of proteins and vice versa.

4. Others arising by June 2007.

Information for BioHub can be found at: http://woldlab.caltech.edu/biohub

Information for CompClust can be found at: http://nar.oxfordjournals.org/cgi/content/full/33/8/2580

Internship Openings:

Up to 2 students.
