The Georgia Cancer Center's Biostatistics and Bioinformatics Core provides expertise in integrative computational analysis for basic, clinical, and translational research.

Bioinformatics support ranges in scope from simple consultations to in-depth collaborations. We require the investigator's participation throughout the data analysis because we believe that input on the biological parameters is paramount to the success of the analysis.

Campus users have access to several advanced computing servers owned by the Georgia Cancer Center, including a High Performance Computing (HPC) server with 544 total compute cores and 2.9 TB of aggregate memory. The system is composed of 15 PowerEdge R430 1U nodes (128 GB RAM each), 1 PowerEdge R830 high-memory node (1024 GB RAM), and a high-speed 40GbE interconnect for communication within the cluster. The HPC also houses a Qumulo storage system with 652 TB of raw capacity, which supports effective data management and maintenance as well as efficient analysis of large data sets, and which is dedicated to the Bioinformatics Shared Resource. Training in, or knowledge of, Linux is required to use the HPC server.
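As a quick check of the memory figure quoted above, the per-node numbers can be added up. This is only an arithmetic sketch: the node counts and RAM sizes come from the description, while the variable names and the choice of 1 TB = 1024 GB are assumptions made for illustration.

```python
# Node counts and per-node RAM are taken from the HPC description above.
STANDARD_NODES = 15            # PowerEdge R430 nodes
STANDARD_NODE_RAM_GB = 128
HIGH_MEMORY_NODES = 1          # PowerEdge R830 node
HIGH_MEMORY_NODE_RAM_GB = 1024


def aggregate_memory_gb():
    """Sum the RAM of all compute nodes, in GB."""
    return (STANDARD_NODES * STANDARD_NODE_RAM_GB
            + HIGH_MEMORY_NODES * HIGH_MEMORY_NODE_RAM_GB)


total_gb = aggregate_memory_gb()
total_tb = total_gb / 1024     # assumption: 1 TB = 1024 GB
print(f"{total_gb} GB total, about {total_tb:.1f} TB")
```

The total is 2944 GB, which rounds to the 2.9 TB stated above under either the 1024 or the 1000 convention.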

Mission

Our mission is to provide collaborative support in all areas of biostatistics, including study design, analysis, and interpretation. This work may involve interaction with industry, government, and regulatory agencies in clinical trials, epidemiology, and laboratory studies, as well as data mining of local and national databases for hypothesis generation and scientific investigation.

The Biostatistics Core (BC) is dedicated to supporting members of the Georgia Cancer Center in their investigative studies and clinical trials. Researchers will find expertise in designing, planning, conducting, analyzing, and reporting clinical trials as well as epidemiologic and population-based studies.

The Biostatistics investigators also conduct independently sponsored research in statistical analysis and in data mining of Cancer Center registry data, clinical and laboratory data, SEER, and other national databases. These studies can greatly benefit the work of Cancer Center members. Some Biostatistics Core members are faculty in the Department of Population Health Sciences and provide educational programs to meet the needs of GCC investigators.

For more information, please go to the Biostatistics & Bioinformatics section on the Shared Resources page.

Contact Us

Bioinformatics Shared Resource

Health Sciences Campus

Georgia Cancer Center - M. Bert Storey Research Building

1410 Laney Walker Boulevard, Augusta, GA

CN-3112

Equipment & Services

Services & Activities

  • Consultation and quantitative research in collaboration with scientists across basic, population, and clinical sciences who are engaged in the planning, conduct, and interpretation of research.

  • Statistical needs for cancer researchers related to protocol design, statistical analysis plans, analysis of clinical trials, and interpretation of findings, in addition to interaction with sponsors and regulatory agencies.

  • Statistical needs for non-intervention studies in basic and population sciences.

  • Organize educational programs for research, clinical faculty, residents, and fellows.

  • Collaborate with investigators in study development, implementation, and publication by providing assistance with study design, statistical analysis plans, sample size and power considerations, statistical analysis, and grant and manuscript preparation.

  • Statistical programming in SAS, R, SPSS, and Stata.

  • Data mining of the SEER and GCC registry databases, as well as other national databases, for hypothesis generation and for answering researchers' scientific inquiries.

  • Serve on the Clinical Trials Protocol Review and Monitoring Committee (PRMC).

  • Utilize and adapt novel statistical methodologies to respond to challenging research issues in ongoing cancer research and in epidemiologic and population-based studies.

  • Collaborate with academia and government agencies such as the CDC, departments of health, and the NIH, as well as university cancer centers and biostatistics departments.

Experiment Types

  • Whole genome sequencing (WGS)
  • Whole exome sequencing (WES)
  • Targeted sequencing (TS)
  • Whole transcriptome sequencing (RNA)
  • ChIP-seq for transcription factors (TF-ChIP)
  • ChIP-seq for histone marks (HM-ChIP)
  • Whole genome bisulfite sequencing (WGBS)
  • Reduced representation bisulfite sequencing (RRBS)

De Novo Genome Assembly

  • Assemble sequence reads
  • Assess assembly statistics
  • Validate an assembly
  • Run BLAST to a nucleotide database
  • Compare to the closest public genomes
  • Ab initio gene prediction
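One commonly reported assembly statistic is N50. The sketch below shows how it can be computed from a list of contig lengths. It is an illustration only: the function name and the example lengths are hypothetical, and the Core's actual assessment tools are not specified here.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembled bases.
    """
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths (bp): total 290, half 145.
# The two longest contigs (80 + 70 = 150) reach the halfway point.
print(n50([80, 70, 50, 40, 30, 20]))  # prints 70
```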

Enrichment Identification

  • Identify enriched regions (peaks) using statistical models
  • Generate a table for enriched regions
  • Generate figures for quality control of peak calling
  • Generate tag density plots for genomic features
  • Prepare tracks for the IGV genome browser

Differential Expression

  • Perform statistical tests
  • Generate a table for fold change, p-values, and q-values
  • Generate figures for differential expression analysis
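To make the columns of such a table concrete, the sketch below computes a log2 fold change and Benjamini-Hochberg adjusted p-values, which are often reported as q-values. The choice of the Benjamini-Hochberg procedure, the pseudocount, and all names are assumptions for illustration; the statistical methods actually used by the Core are not stated here.

```python
import math


def log2_fold_change(mean_treated, mean_control, pseudocount=1.0):
    """log2 ratio of two mean expression values.

    The pseudocount (an assumption) avoids division by zero.
    """
    return math.log2((mean_treated + pseudocount) / (mean_control + pseudocount))


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for four genes.
p = [0.01, 0.04, 0.03, 0.005]
for p_value, q_value in zip(p, benjamini_hochberg(p)):
    print(f"p = {p_value:.3f}  q = {q_value:.3f}")
```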

 

Quality Assessment

  • Assess sequencing quality using a variety of metrics

De Novo Transcriptome Assembly

  • Assemble sequence reads
  • Assess assembly statistics
  • Run BLAST to a protein database
  • Compare to the closest public transcriptomes
  • Find orthologs and paralogs

Functional Annotation

  • Prepare input files
  • Compare gene sets with GO terms and pathways
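One common way to compare a gene set with a GO term or pathway is an over-representation test based on the hypergeometric distribution. The sketch below is a minimal illustration under that assumption; the function name, parameter names, and example numbers are hypothetical, and the Core may use a different method.

```python
from math import comb


def enrichment_p_value(background_size, term_size, list_size, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    background_size: number of genes in the background
    term_size:       background genes annotated with the term
    list_size:       genes in the list of interest
    overlap:         genes in the list that carry the term
    """
    max_overlap = min(term_size, list_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(term_size, i) * comb(background_size - term_size, list_size - i)
        for i in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical example: 20 background genes, 5 annotated with the term,
# 4 genes of interest, 2 of which carry the term.
print(enrichment_p_value(20, 5, 4, 2))
```

When many terms are tested, the resulting p-values would normally be corrected for multiple testing.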

Alternative Splicing

  • Measure alternative splicing in each sample
  • Compare samples or groups for splicing changes
  • Summarize alternative splicing and splicing changes
  • Generate figures for alternative splicing and splicing change analysis

 

Read Mapping

  • Align sequence reads to reference sequences
  • Summarize mapping results
  • Generate BAM and BigWig files for the IGV genome browser

Expression Profile

  • Generate a table for read counts and FPKMs
  • Generate figures for quality control analysis
  • Assess the variation between samples and replicates
  • Detect outliers
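For reference, FPKM (fragments per kilobase of transcript per million mapped fragments) normalizes a raw count by gene length and sequencing depth. The sketch below shows the standard formula; the function name and the example numbers are hypothetical.

```python
def fpkm(fragment_count, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    FPKM = count * 1e9 / (gene length in bp * total mapped fragments)
    """
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("gene length and library size must be positive")
    return fragment_count * 1e9 / (gene_length_bp * total_mapped_fragments)


# Hypothetical gene: 500 fragments, 2,000 bp long, in a library of
# 10 million mapped fragments.
print(fpkm(500, 2000, 10_000_000))  # prints 25.0
```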

R and Python Programs

Seurat
  • Cell clusters (tSNE and UMAP)
  • Cell markers for each cell cluster
  • Differentially expressed genes
  • Cluster- and sample-based heatmaps
  • Violin plots of target gene expression
  • Cell cycle scoring

Gene Fusion

  • Generate a table and figure for gene fusion

 

Sequence Variants

  • Identify sequence variants
  • Generate VCF files
  • Annotate sequence variants
  • Compare to known databases
  • Select high-quality variants according to the user's criteria
  • Prepare tracks for the IGV genome browser

Single Cell Analysis

Cell Ranger and Loupe Browser by 10x Genomics
  • Cell clusters (tSNE and UMAP)
  • Differentially expressed genes
  • Cell markers for each cell cluster
  • Cluster- and sample-based heatmaps
  • Violin plots of target gene expression

Sequence Motif

  • Motifs from peaks
  • Search peaks for sequence motifs
  • Compare to known databases
  • Generate binding logos for sequence motifs

Methylation Profile

  • Generate a table for beta values
  • Generate figures for quality control
  • Assess the variation between samples and replicates
  • Detect outliers
  • Prepare tracks for the IGV genome browser
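As an illustration of the beta values mentioned above: for bisulfite sequencing data, a beta value is commonly taken as the fraction of reads at a site that support methylation. The sketch below assumes that definition. The function name is hypothetical, and the optional offset (used in some array-based pipelines) is an assumption, not a description of the Core's workflow.

```python
def beta_value(methylated, unmethylated, offset=0.0):
    """Methylation level at one site, between 0 and 1.

    beta = methylated / (methylated + unmethylated + offset)
    Returns None when the site has no coverage.
    """
    total = methylated + unmethylated + offset
    if total <= 0:
        return None
    return methylated / total


# Hypothetical CpG site covered by 30 methylated and 10 unmethylated reads.
print(beta_value(30, 10))  # prints 0.75
```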

 

Structural Variants

  • Identify structural variants
  • Generate VCF files
  • Annotate structural variants
  • Compare to known databases
  • Select high-quality variants according to the user's criteria
  • Prepare tracks for the IGV genome browser

Analyses

  • Functional annotation (GO, Pathway) - ALL
  • NCBI deposit - ALL
  • Quality assessment - ALL
  • Read mapping - ALL
  • Alternative splicing - RNA
  • Differential expression - RNA
  • Expression profile - RNA
  • Gene fusion - RNA
  • Gene set enrichment analysis - RNA
  • Sequence variants - WGS, WES, TS
  • Structural variants - WGS
  • De novo genome assembly - WGS, RNA
  • De novo transcriptome assembly - RNA
  • Methylation profile - WGBS, RRBS
  • Differential methylation - WGBS, RRBS
  • Enrichment identification - TF-ChIP, HM-ChIP
  • Differential enrichment - TF-ChIP, HM-ChIP
  • Sequence motif - TF-ChIP
  • Single cell analysis - scRNA, scATAC

Differential Enrichment

  • Perform statistical tests
  • Generate a table for fold change, p-values, and q-values
  • Generate figures for differential enrichment analysis
  • Annotate differentially enriched regions
  • Prepare tracks for the UCSC genome browser

NCBI Deposit

  • Generate necessary files in appropriate formats
  • Help fill in the forms in the metadata files
  • Upload files to the NCBI database

 

Gene Set Enrichment Analysis

  • Prepare input files for GSEA
  • Run GSEA
  • Summarize the results

 

Differential Methylation

  • Perform statistical tests
  • Generate a table for methylation change, p-values, and q-values
  • Generate figures for differential methylation analysis
  • Annotate differentially methylated regions
  • Prepare tracks for the IGV genome browser

Meet Our Team

Ramses F. Sadek, PhD

  • Director, Biostatistics Core

(706) 721-6930

Li Fang Zhang, MD, MS

  • Biostatistician & SAS Program Instructor

(706) 721-4453