Seminar Announcements

[Venue changed] Seminar schedule for August 13, 2014

Posted by
admbioinfo
Date posted
2014-08-07

Bioinformatics Seminar Notice

 

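The Seoul National University Bioinformatics Institute and the Interdisciplinary Program in Bioinformatics are jointly hosting a special seminar as detailed below. We hope many of you will attend.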

 

Date and time: August 13, 2014 (Wed), 11:00

Speaker: Dr. Youngik Yang (양영익), J. Craig Venter Institute

Venue: Main Conference Room, Room 625, Building 220

 

Title: SFA-SPA: a suffix array based short peptide assembler for metagenomic data

 

Metagenomics extends the power of genomic analysis to entire communities of microbes, bypassing the need to isolate and culture individual bacterial community members. In this paradigm, DNA is extracted and sequenced directly from an environmental sample. Because of its cost-effectiveness, next-generation sequencing (NGS) technology is routinely used in metagenomics studies. The deep coverage of NGS reads compensates for their short length. The massive volume of sequence data from an environmental sample provides rich information for understanding the microbial community. On the other hand, it also exposes many new challenges, such as unknown taxonomic origins and abundance distributions, sequence variation, and non-uniform sequencing depth. Because of these confounding factors, reconstructing genomes from metagenomic reads is computationally very challenging, and nucleotide assembly often yields fragmented contigs. Furthermore, many reads remain unassembled. Owing to the poor assembly, downstream analyses, including analysis of the protein sequences in these data, become difficult and unreliable.

We previously introduced an algorithm and software for the accurate reconstruction of protein sequences from short peptides identified on nucleotide reads in a metagenomic dataset. It has been tested on multiple simulated and real datasets, and it outperformed a competing strategy, gene prediction after assembly of nucleotide reads, on multiple criteria such as accuracy, read assembly rate, and chimera rate.

Here we present significant computational improvements to the short peptide assembly algorithm that make it practical to reconstruct proteins from large metagenomic datasets containing several hundred million reads while maintaining accuracy. The improved efficiency is achieved by using a suffix array data structure, which allows fast querying during the assembly process, and by redesigning the assembly steps in a way that also facilitates multithreaded execution.
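As a rough illustration of why a suffix array allows fast querying, the following is a minimal sketch and not the SFA-SPA implementation. The function names, the "$" read separator, and the toy peptide reads are all hypothetical, and the naive construction is chosen for readability rather than speed. It indexes a set of short peptides once and then finds every read containing a given substring by binary search.

```python
from bisect import bisect_right


def build_index(reads):
    """Concatenate reads with '$' terminators and sort all suffix start positions.

    Naive construction for illustration only; large datasets need a
    linear-time or external-memory suffix array builder.
    """
    text = "".join(read + "$" for read in reads)
    starts, offset = [], 0
    for read in reads:
        starts.append(offset)          # start position of each read in text
        offset += len(read) + 1        # +1 for the '$' terminator
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    return text, starts, suffix_array


def find_positions(text, suffix_array, pattern):
    """Return sorted text positions where pattern occurs, via two binary searches."""
    m = len(pattern)

    def bound(strict):
        lo, hi = 0, len(suffix_array)
        while lo < hi:
            mid = (lo + hi) // 2
            prefix = text[suffix_array[mid]:suffix_array[mid] + m]
            if prefix < pattern or (not strict and prefix == pattern):
                lo = mid + 1
            else:
                hi = mid
        return lo

    first = bound(strict=True)    # first suffix whose prefix is >= pattern
    last = bound(strict=False)    # first suffix whose prefix is > pattern
    return sorted(suffix_array[first:last])


def reads_containing(reads, pattern):
    """Return indices of reads that contain pattern (pattern must not include '$')."""
    text, starts, suffix_array = build_index(reads)
    hits = find_positions(text, suffix_array, pattern)
    return sorted({bisect_right(starts, pos) - 1 for pos in hits})


if __name__ == "__main__":
    toy_reads = ["MKVLAT", "VLATGG", "ATGGKP"]  # hypothetical peptide reads
    print(reads_containing(toy_reads, "ATGG"))
```

In practice the index would be built once and reused across queries; each lookup then costs a number of string comparisons logarithmic in the total text length, which is what makes repeated overlap queries during assembly inexpensive.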

 

Highest degree: Ph.D. in Computer Science, Indiana University, 2010

 

Recent research activities

Development of a novel short peptide sequence assembler for metagenomic sequence analysis.

Human oral metagenome/metatranscriptome sequence analysis using next-generation sequencing data.

Genome/Metagenome projects

 

 

Jointly hosted by the Seoul National University Bioinformatics Institute

and the Interdisciplinary Program in Bioinformatics