Biostatistics and Bioinformatics

The Georgia Cancer Center's Biostatistics and Bioinformatics Core provides expertise in integrative computational analysis for basic, clinical, and translational research.

Bioinformatics support ranges from simple consultations to in-depth collaborations. We require the investigator's participation throughout the data analysis because we believe that input on the biological parameters is essential to the success of the analysis.

Campus users have access to several advanced computing servers owned by the Georgia Cancer Center, including a High Performance Computing (HPC) server with 544 total compute cores and about 2.9 TB of aggregate memory. The system is composed of 15 PowerEdge R430 1U nodes (128 GB RAM each), 1 PowerEdge R830 high-memory node (1024 GB RAM), and a high-speed 40GbE interconnect for communication between nodes. The HPC, which is committed to the Bioinformatics Shared Resource, also houses a Qumulo storage system with 652 TB of raw capacity that supports effective data management and maintenance as well as efficient analysis of large data sets. Training in or knowledge of Linux is required to use the HPC server.
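As a quick check of the memory figure, the per-node RAM values stated above can be summed. This is only an arithmetic sketch; the constant and function names are illustrative and are not part of any HPC tooling.

```python
# Node counts and per-node RAM as stated in the hardware description.
STANDARD_NODES = 15          # PowerEdge R430 nodes
STANDARD_NODE_RAM_GB = 128
HIGH_MEM_NODES = 1           # PowerEdge R830 node
HIGH_MEM_NODE_RAM_GB = 1024


def aggregate_memory_gb() -> int:
    """Sum the RAM across all compute nodes, in GB."""
    return (STANDARD_NODES * STANDARD_NODE_RAM_GB
            + HIGH_MEM_NODES * HIGH_MEM_NODE_RAM_GB)


total_gb = aggregate_memory_gb()
# Dividing by 1024 converts GB to TB (binary convention).
print(f"{total_gb} GB is about {total_gb / 1024:.1f} TB")
```

The sum is 2944 GB, which rounds to the 2.9 TB quoted for the system.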

Mission

Our mission is to provide collaborative support in all areas of biostatistics, including study design, analysis, and interpretation. This work may involve interaction with industry, government, and regulatory agencies in clinical trials, epidemiology, and laboratory studies, as well as data mining of local and national databases for hypothesis generation and scientific investigation.

The Biostatistics Core (BC) is dedicated to supporting members of the Georgia Cancer Center in their investigative studies and clinical trials. Researchers will find expertise in designing, planning, conducting, analyzing, and reporting clinical trials as well as epidemiologic and population-based studies.

The Biostatistics investigators also conduct independently sponsored research in statistical analysis and in data mining of the Cancer Center registry data, clinical and laboratory data, SEER, and other national databases. These studies can greatly benefit the work of Cancer Center members. Some Biostatistics Core members are faculty in the Department of Population Health Sciences and provide educational programs to meet the needs of GCC investigators.

For more information please go to the Biostatistics & Bioinformatics section on the Shared Resources page.

Contact Us

Bioinformatics Shared Resource

Health Sciences Campus

GCC - M. Bert Storey Research Building

1410 Laney Walker Boulevard, Augusta, GA

CN-3112

Equipment & Services

Services & Activities

Consultation and quantitative research in collaboration with scientists across basic, population, and clinical sciences, engaged in the planning, conduct and interpretation of research.
Statistical needs for cancer researchers related to protocol design, statistical analysis plans, analysis of clinical trials, and interpretation of findings, in addition to interaction with sponsors and regulatory agencies.
Statistical needs for non-intervention studies in basic and population sciences.
Organize educational programs for research, clinical faculty, residents, and fellows.
Collaborate with investigators in study development, implementation, and publication by providing assistance with study design, statistical analysis plans, sample size and power considerations, statistical analysis, and grant and manuscript preparation.

Statistical programming in SAS, R, SPSS, and Stata.
Data mining of SEER and GCC registry databases as well as other national databases for hypothesis generation and answering researchers' scientific inquiries.
Serve on the Clinical Trials Protocol Review and Monitoring Committee (PRMC).
Utilize and adapt novel statistical methodologies to respond to challenging research issues in ongoing cancer research and in epidemiologic and population-based studies.
Collaborate with academia and government agencies such as the CDC, departments of health, the NIH, and university cancer centers and biostatistics departments.
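To illustrate the sample size and power considerations listed among the services above, here is a minimal sketch of a textbook calculation: the per-group sample size for comparing two means with a two-sided test, using the normal approximation. The function name and default values are assumptions chosen for illustration; an actual study design would be worked out with the Core and may use a different method.

```python
import math
from statistics import NormalDist


def n_per_group(delta: float, sigma: float,
                alpha: float = 0.05, power: float = 0.80) -> int:
    """Per-group sample size for a two-sided, two-sample comparison of means.

    Normal approximation: n = 2 * ((z_{1-alpha/2} + z_{power}) * sigma / delta)^2
    delta: smallest difference in means worth detecting
    sigma: common standard deviation assumed for both groups
    """
    if delta == 0:
        raise ValueError("delta must be non-zero")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    n = 2 * ((z_alpha + z_power) * sigma / delta) ** 2
    return math.ceil(n)  # round up to a whole participant


# Detect a half-standard-deviation difference at 5% significance, 80% power.
print(n_per_group(delta=0.5, sigma=1.0))
```

Because the formula uses the normal rather than the t distribution, it slightly underestimates the sample size for small studies.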

Experiment Types

Whole genome sequencing (WGS)
Whole exome sequencing (WES)
Target sequencing (TS)
Whole transcriptome sequencing (RNA)
ChIP-seq for transcription factors (TF-ChIP)
ChIP-seq for histone marks (HM-ChIP)
Whole genome bisulfite sequencing (WGBS)
Reduced representation bisulfite sequencing (RRBS)

De Novo Genome Assembly

Assemble sequence reads
Assess assembly statistics
Validate an assembly
Run BLAST to a nucleotide database
Compare to the closest public genomes
Ab initio gene prediction
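One commonly reported assembly statistic is N50: the contig length at which contigs of that length or longer account for at least half of the assembled bases. The sketch below shows the calculation under that definition. The function name is illustrative, and the page does not state which statistics or tools the Core actually reports.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Toy example with made-up contig lengths.
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```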

Enrichment Identification

Identify enriched regions (peaks) using statistical models
Generate a table for enriched regions
Generate figures for quality control of peak calling
Generate tag density plots for genomic features
Prepare tracks for the IGV genome browser

Differential Expression

Perform statistical tests
Generate a table for fold change, p-values, and q-values
Generate figures for differential expression analysis
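The fold change and q-value columns in such a table can be sketched as follows. This assumes a log2 fold change with a pseudocount and uses Benjamini-Hochberg adjusted p-values, which are often reported as q-values. The page does not say which method the Core uses, the function names are illustrative, and the statistical test that produces the p-values is not shown.

```python
import math


def log2_fold_change(mean_treated, mean_control, pseudocount=1.0):
    """Log2 ratio of mean expression; the pseudocount avoids division by zero."""
    return math.log2((mean_treated + pseudocount) / (mean_control + pseudocount))


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values for four genes.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
print(log2_fold_change(7.0, 1.0))
```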

Quality Assessment

Assess sequencing quality using a variety of metrics

De Novo Transcriptome Assembly

Assemble sequence reads
Assess assembly statistics
Run BLAST to a protein database
Compare to the closest public transcriptomes
Find orthologs and paralogs

Functional Annotation

Prepare input files
Compare gene sets with GO terms and pathways
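One common way to compare a gene set with a GO term or pathway is an over-representation test based on the hypergeometric distribution. The sketch below assumes that approach. The page does not specify the method used, and the function and parameter names are illustrative.

```python
from math import comb


def enrichment_p_value(population, annotated, selected, overlap):
    """One-sided hypergeometric p-value, P(X >= overlap).

    population: number of genes in the background
    annotated:  background genes carrying the GO term or pathway
    selected:   genes in the list being tested
    overlap:    selected genes that carry the annotation
    """
    if not (0 <= annotated <= population and 0 <= selected <= population):
        raise ValueError("annotated and selected must lie within the population")
    lowest = max(overlap, 0)
    highest = min(annotated, selected)
    total = comb(population, selected)
    # Count the ways to draw k annotated genes and (selected - k) others.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(lowest, highest + 1)
    )
    return tail / total


# Toy numbers: 5 of 10 background genes are annotated, and all 5 selected genes hit.
print(enrichment_p_value(population=10, annotated=5, selected=5, overlap=5))
```

When many terms are tested, the resulting p-values would normally be corrected for multiple testing.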

Alternative Splicing

Measure alternative splicing in each sample
Compare samples or groups for splicing changes
Summarize alternative splicing and splicing changes
Generate figures for alternative splicing and splicing change analysis

Read Mapping

Align sequence reads to reference sequences
Summarize mapping results
Generate BAM and BigWig files for the IGV genome browser

Expression Profile

Generate a table for read counts and FPKMs
Generate figures for quality control analysis
Assess the variation between samples and replicates
Detect outliers
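FPKM (fragments per kilobase of transcript per million mapped fragments) normalizes a raw count by gene length and sequencing depth. A minimal sketch of the standard formula follows. The gene names and numbers are made up, and the function name is illustrative rather than taken from the Core's pipeline.

```python
def fpkm(fragment_counts, gene_lengths_bp):
    """Convert raw fragment counts to FPKM.

    FPKM = count * 1e9 / (gene length in bp * total mapped fragments)
    Both arguments are dicts keyed by gene name.
    """
    total = sum(fragment_counts.values())
    if total == 0:
        raise ValueError("no mapped fragments")
    return {
        gene: count * 1e9 / (gene_lengths_bp[gene] * total)
        for gene, count in fragment_counts.items()
    }


# Hypothetical genes: geneB is three times longer and has three times the counts.
counts = {"geneA": 500, "geneB": 1500}
lengths = {"geneA": 1000, "geneB": 3000}
print(fpkm(counts, lengths))
```

In this toy example both genes end up with the same FPKM, because the larger count for geneB is explained entirely by its length.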

R and Python Programs

Seurat

Cell clusters tSNE and UMAP
Cell cluster cell markers
Differentially expressed genes
Cluster- and sample-based heatmaps
Violin plots based on target gene expression
Cell cycle scoring

Gene Fusion

Generate a table and figure for gene fusion

Sequence Variants

Identify sequence variants
Generate VCF files
Annotate sequence variants
Compare to known databases
Select high-quality variants by user's criteria
Prepare tracks for the IGV genome browser

Single Cell Analysis

Cellranger and Loupe cell browser by 10X Genomics

Cell clusters tSNE and UMAP
Differentially expressed genes
Cell cluster cell markers
Cluster- and sample-based heatmaps
Violin plots based on target gene expression

Sequence Motif

Motifs from peaks
Search peaks for sequence motifs
Compare to known databases
Generate binding logos for sequence motifs

Methylation Profile

Generate a table for beta values
Generate figures for quality control
Assess the variation between samples and replicates
Detect outliers
Prepare tracks for the IGV genome browser
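For bisulfite sequencing data, the beta value at a CpG site is commonly taken as the fraction of reads that support methylation. The sketch below assumes that definition. Array-based platforms use a slightly different formula, and the function name here is illustrative.

```python
def beta_value(methylated_reads, unmethylated_reads):
    """Fraction of reads at a site that are methylated, between 0 and 1.

    Returns None when the site has no coverage, since the ratio is undefined.
    """
    coverage = methylated_reads + unmethylated_reads
    if coverage == 0:
        return None
    return methylated_reads / coverage


# A site covered by 40 reads, 30 of which are methylated.
print(beta_value(30, 10))
```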

Structural Variants

Identify structural variants
Generate VCF files
Annotate structural variants
Compare to known databases
Select high-quality variants by user's criteria
Prepare tracks for the IGV genome browser

Analyses

Functional annotation (GO, Pathway) - ALL
NCBI deposit - ALL
Quality assessment - ALL
Read mapping - ALL
Alternative splicing - RNA
Differential expression - RNA
Expression profile - RNA
Gene fusion - RNA
Gene set enrichment analysis - RNA
Sequence variants - WGS, WES, TS
Structural variants - WGS
De novo genome assembly - WGS, RNA
De novo transcriptome assembly - RNA
Methylation profile - WGBS, RRBS
Differential methylation - WGBS, RRBS
Enrichment identification - TF-ChIP, HM-ChIP
Differential enrichment - TF-ChIP, HM-ChIP
Sequence motif - TF-ChIP
Single cell analysis - scRNA, scATAC

Differential Enrichment

Perform statistical tests
Generate a table for fold change, p-values, and q-values
Generate figures for differential enrichment analysis
Annotate differentially enriched regions
Prepare tracks for the UCSC genome browser

NCBI Deposit

Generate necessary files in appropriate formats
Help fill in the metadata forms
Upload files to the NCBI database

Gene Set Enrichment Analysis

Prepare input files for GSEA
Run GSEA
Summarize the results

Differential Methylation

Perform statistical tests
Generate a table for methylation change, p-values, and q-values
Generate figures for differential methylation analysis
Annotate differentially methylated regions
Prepare tracks for the IGV genome browser

Meet Our Team

Ramses F. Sadek, PhD

Director, Biostatistics Core

(706) 721-6930

rsadek@augusta.edu

PubMed Publications

Li Fang Zhang, MD, MS

Biostatistician & SAS Program Instructor

(706) 721-4453

lizhang@augusta.edu

PubMed Publications