
News and resources, Jan 24th

Posted on 24, Jan, 2012

Quick update before the end of the month:


The long-awaited paper from Eric Lander on missing heritability has been published and is causing a new round of discussions, following the initial debate after the ASHG announcement. Luke has a short summary over at Genomes Unzipped, and Steve Hsu delves into the supplementary material. A good addition to this is Genetic Inference’s report on the current state of complex trait sequencing via ICHG2011.


BGI has started to move its sequence analysis to GPU-based servers, though the article in Wired is unfortunately light on details about which algorithms ended up being ported to the different architecture.

IonTorrent, meanwhile, has started supporting paired-end reads, sort of, and has announced a new machine. That is a bit of a disappointment, as one of their main selling points was that improvements would come through the chips, not the hardware around them. Be that as it may, we are getting close to the $1000 genome, at least for data generation; that figure doesn’t take the very time-consuming data analysis into account. This is also reflected in my favorite quote from Elaine Mardis’ interview with The Scientist:

“It makes me crazy to listen to people say sequencing is so cheap. Sure, generating data is cheap. But then what do you do with it? That’s where it gets expensive. ‘The $1,000 genome’ is just this throwaway phrase people use and I don’t think they stop to think about what’s involved. When we looked back after we analyzed the first tumor/normal pair, published in Nature in 2008, we figured that genome—not just generating the data, but coming up with the analytical approach, which nobody had ever done before—probably cost $1.6M. If the cost of analysis doesn’t fall over time, we’re never going to get to clinical reality for a lot of these tests.”

This is not going to get any easier as the sequencers get more and more efficient; see Illumina’s announcement of the HiSeq 2500 (summary by OmicsOmics and on the SeqAnswers forum). And though the price of the reagents keeps decreasing, it’s still cheaper to store the data than to re-sequence the samples, storage problems notwithstanding.

Open Science

Michael Eisen has a comment in the NYT on the Research Works Act that’s recommended reading. If you are a member of ISCB you might want to consider signing their policy statement which strongly opposes the act.


  • If you aren’t following Edge you are missing out on some great science debates. The Guardian talks to its founder, John Brockman
  • TopHat gets a new release
  • Aaron Kitzmiller has a terrific commentary regarding the Core model in academia, and why this is incredibly difficult to get right for bioinformatics
  • St Jude’s releases Explore, a portal to their pediatric cancer genome data
  • Keith Bradnam summarizes why it is so difficult to evaluate genome assemblies
  • Dan Koboldt provides a neat summary of the current state of dbSNP

More soon!

Weekly snippets, holiday edition

Posted on 08, Jan, 2012

Back in the office after the holidays, with quite some catching up to do.

Growth in genomics

Coverage of computational biology and genomics in the general media continues to increase. This time the Economist covers bioinformatics, and the New York Times has an article on computational biology and cancer. And while public funding for some of the current genome centers is being cut by as much as 20 percent, new centers in New York and Connecticut are hoping to benefit from the increase in demand.

Clinical grade sequencing

Some of this demand is driven by a general trend towards clinical sequencing, which is benefiting the Broad and other centers. Given the inroads made in identifying causal mutations in 2011 (nicely summarized by MassGenomics for disorders and cancers), this is perhaps not surprising, and even clinical trials for personal genome sequencing are kicking off.

While the technology is making rapid progress, we still have to deal with a large number of problems, among them the handling of genomics and privacy, how to make sense of the results in the first place (something that new initiatives like openSNP are trying to address), and the discrepancies caused by differences in sequencing technologies and data processing. We will need a public assessment or competition of workflows and methods, similar to what the Assemblathon and GAGE have been doing for genome assembly approaches.

Resources and discussions

See you in a week or two!