Hochleistungsfamilienvaterpostdoktorand
53 stories · 3 followers

“All oil-field workers are radiation workers.” How is it possible that the rules...


“All oil-field workers are radiation workers.”

How is it possible that the rules about radiation exposure are so thoroughly circumvented by the oil industry, like it’s no big deal?

biocs
850 days ago
Dresden, Germany

The future of board games: malleable, personal, and finite ↦


by Dan Moren

Slate has a nice piece on the changing future of board games, from things that you play endlessly to a more finite experience, and the designer who started the trend:

Designers like Rob Daviau are at the center of this change. In 2008, Daviau was working at the toy behemoth Hasbro, cranking out variations on classic games: a Harry Potter edition of Clue, Star Wars Trivial Pursuit, and countless versions of Risk. One day, at a brainstorming session for Clue, he cracked a joke. “I don’t know why they keep inviting these people over,” he said. “They’re all murderers.”

I’ve been playing Daviau’s Pandemic Legacy with a few friends since earlier this year, and it’s just as good as the buzz says. Last night, during our latest session, we ended up FaceTiming one of our players who couldn’t make it, because we had to share with her the utter craziness that had just gone down.

Seafall, Daviau’s next title, might be among the most hotly anticipated games of all time. When I was at Gen Con earlier this month, one of my friends made a beeline to the booth for publisher Z-Man as soon as the doors to the exhibit hall opened, but they were already sold out of all their copies. (It doesn’t ship for real until later this fall.)

The current Pandemic Legacy is billed as “Season 1”, so I’m intrigued to see what happens when a Season 2—which they’re reportedly working on—comes down the pike.

[Read on Six Colors.]

biocs
2100 days ago
Dresden, Germany

Oligomers of Heat-Shock Proteins: Structures That Don’t Imply Function


by William M. Jacobs, Tuomas P. J. Knowles, Daan Frenkel

Most proteins must remain soluble in the cytosol in order to perform their biological functions. To protect against undesired protein aggregation, living cells maintain a population of molecular chaperones that ensure the solubility of the proteome. Here we report simulations of a lattice model of interacting proteins to understand how low concentrations of passive molecular chaperones, such as small heat-shock proteins, suppress thermodynamic instabilities in protein solutions. Given fixed concentrations of chaperones and client proteins, the solubility of the proteome can be increased by tuning the chaperone–client binding strength. Surprisingly, we find that the binding strength that optimizes solubility while preventing irreversible chaperone binding also promotes the formation of weakly bound chaperone oligomers, although the presence of these oligomers does not significantly affect the thermodynamic stability of the solution. Such oligomers are commonly observed in experiments on small heat-shock proteins, but their connection to the biological function of these chaperones has remained unclear. Our simulations suggest that this clustering may not have any essential biological function, but rather emerges as a natural side-effect of optimizing the thermodynamic stability of the proteome.
biocs
2278 days ago
Dresden, Germany

Matt Taibbi on how America made Donald Trump unstoppable

"Trump found the flaw in the American Death Star"  
biocs
2282 days ago
Dresden, Germany

Prediction of Functionally Important Phospho-Regulatory Events in Xenopus laevis Oocytes


by Jeffrey R. Johnson, Silvia D. Santos, Tasha Johnson, Ursula Pieper, Marta Strumillo, Omar Wagih, Andrej Sali, Nevan J. Krogan, Pedro Beltrao

The African clawed frog Xenopus laevis is an important model organism for studies in developmental and cell biology, including cell-signaling. However, our knowledge of X. laevis protein post-translational modifications remains scarce. Here, we used a mass spectrometry-based approach to survey the phosphoproteome of this species, compiling a list of 2636 phosphosites. We used structural information and phosphoproteomic data for 13 other species in order to predict functionally important phospho-regulatory events. We found that the degree of conservation of phosphosites across species is predictive of sites with known molecular function. In addition, we predicted kinase-protein interactions for a set of cell-cycle kinases across all species. The degree of conservation of kinase-protein interactions was found to be predictive of functionally relevant regulatory interactions. Finally, using comparative protein structure models, we find that phosphosites within structured domains tend to be located at positions with high conformational flexibility. Our analysis suggests that a small class of phosphosites occurs in positions that have the potential to regulate protein conformation.
biocs
2464 days ago
Dresden, Germany

Near-optimal RNA-Seq quantification with kallisto


Today I posted the preprint N. Bray, H. Pimentel, P. Melsted and L. Pachter, Near-optimal RNA-Seq quantification with kallisto to the arXiv. It describes the RNA-Seq quantification program kallisto.

The project began in August 2013 when I wrote my second blog post, about another arXiv preprint describing a program for RNA-Seq quantification called Sailfish (now a published paper). At the time, a few students and postdocs in my group read the paper and then discussed it in our weekly journal club. It advocated a philosophy of “lightweight algorithms, which make frugal use of data, respect constant factors and effectively use concurrent hardware by working with small units of data where possible”. Indeed, two themes emerged in the journal club discussion:

1. Sailfish was much faster than other methods by virtue of being simpler.

2. The simplicity was to replace approximate alignment of reads with exact alignment of k-mers. When reads are shredded into their constituent k-mer “mini-reads”, the difficult read -> reference alignment problem in the presence of errors becomes an exact matching problem efficiently solvable with a hash table.
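
The second point can be illustrated with a small sketch. It is not Sailfish's code, only a toy version of the idea under stated assumptions: the transcript names and sequences, the helper names (`kmers`, `build_kmer_index`, `match_read`) and the choice of k = 5 are all hypothetical. The transcriptome is indexed once in a hash table keyed by k-mer, and each read is then shredded into k-mers that are looked up by exact match.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield every length-k substring of seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_kmer_index(transcripts, k):
    """Map each k-mer to the set of transcript names that contain it."""
    index = defaultdict(set)
    for name, seq in transcripts.items():
        for kmer in kmers(seq, k):
            index[kmer].add(name)
    return index


def match_read(read, index, k):
    """Shred the read into k-mers and look each one up by exact match.

    Returns a list of (k-mer, transcripts) pairs. A k-mer that hits nothing
    (for example because it spans a sequencing error) gets an empty set.
    """
    return [(kmer, index.get(kmer, set())) for kmer in kmers(read, k)]


# Hypothetical toy transcriptome, for illustration only.
transcripts = {"tA": "ACGTACGGA", "tB": "TTACGGACC"}
index = build_kmer_index(transcripts, k=5)
hits = match_read("TACGGA", index, k=5)
```

Each lookup is a constant-time dictionary access, which is where the speed comes from. The cost is also visible: a single erroneous base silently removes every k-mer that overlaps it, so information in the read is lost.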

We felt that the shredding of reads must lead to reduced accuracy, and we quickly checked and found that to be the case. In fact, in our simulations, we saw that Sailfish significantly underperformed methods such as RSEM. However, the fact that simpler was so much faster led us to wonder whether the prevailing wisdom of seeking to improve RNA-Seq analysis by looking at increasingly complex models was ill-founded. Perhaps simpler could be not only fast, but also accurate, or at least close enough to best-in-class for practical purposes.

After thinking about the problem carefully, my (now former) student Nicolas Bray realized that the key is to abandon the idea that alignments are necessary for RNA-Seq quantification. Even Sailfish makes use of alignments (of k-mers rather than reads, but alignments nonetheless). In fact, thinking about all the tools available, Nick realized that every RNA-Seq analysis program was being developed in the context of a “pipeline” of first aligning reads or parts of them to a reference genome or transcriptome. Nick had the insight to ask: what can be gained if we let go of that paradigm?

By April 2014 we had formalized the notion of “pseudoalignment” and Nick had written, in Python, a prototype of a pseudoaligner. He called the program kallisto. The basic idea was to determine, for each read, not where in each transcript it aligns, but rather which transcripts it is compatible with. That is asking for a lot less, and as it turns out, pseudoalignment can be much faster than alignment. At the same time, the information in pseudoalignments is enough to quantify abundances using a simple model for RNA-Seq, a point made in the isoEM paper, and an idea that Sailfish made use of as well.
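
To make "which transcripts it is compatible with" concrete, here is a minimal sketch. It assumes that a read is compatible with a transcript when every k-mer of the read occurs in that transcript, and it computes the answer by intersecting per-k-mer transcript sets. This is one simple way to realize the idea, not necessarily how kallisto implements it; the function names and the toy transcriptome are hypothetical.

```python
from collections import defaultdict


def build_index(transcripts, k):
    """Map each k-mer to the set of transcripts in which it occurs."""
    index = defaultdict(set)
    for name, seq in transcripts.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return index


def pseudoalign(read, index, k):
    """Return the set of transcripts that contain every k-mer of the read.

    No positions are reported, only compatibility. An empty set means that
    no transcript is compatible (or that the read is shorter than k).
    """
    compatible = None
    for i in range(len(read) - k + 1):
        hits = index.get(read[i:i + k], set())
        compatible = set(hits) if compatible is None else compatible & hits
        if not compatible:
            break  # nothing left to intersect with
    return compatible or set()


# Hypothetical toy transcriptome, for illustration only.
transcripts = {"t1": "ACGTACGGA", "t2": "TTACGGACC", "t3": "ACGTACGTT"}
index = build_index(transcripts, k=5)

ambiguous = pseudoalign("ACGTAC", index, k=5)   # shared by t1 and t3
unique = pseudoalign("GTACGGA", index, k=5)     # only t1 has all three k-mers
```

The output for each read is just a set of transcript names, which is all that the quantification step needs.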

Just how fast is pseudoalignment? In January of this year Páll Melsted from the University of Iceland came to visit my group for a semester sabbatical. Páll had experience in exactly the kinds of computer science we needed to optimize kallisto; he has written about efficient k-mer counting using the bloom filter and de Bruijn graph construction. He translated the Python kallisto to C++, incorporating numerous clever optimizations and a few new ideas along the way. His work was done in collaboration with my student Harold Pimentel, Nick (now a postdoc with Jacob Corn and Jennifer Doudna at the Innovative Genomics Initiative) and myself.

The screenshot below shows kallisto being used on my 2012 iMac desktop to build an index of the human transcriptome (5 min 8 sec), and then quantify 78.6 million GEUVADIS human RNA-Seq reads (14 min). When we first saw these results we thought they were simply too good to be true. Let me repeat: The quantification of 78.6 million reads takes 14 minutes on a standard desktop using a single CPU core. In some tests we’ve gotten even faster running times, up to 15 million reads quantified per minute.

[Figure: screenshot of kallisto building the index and quantifying the reads]

The results in our paper indicate that kallisto is not just fast, but also very accurate. This is not surprising: underlying RNA-Seq analysis are the alignments, and although kallisto is pseudoaligning instead, it is almost always only the compatibility information that is used in actual applications. As we show in our paper, from the point of view of compatibility, the pseudoalignments and alignments are almost the same.

Although accuracy is a primary concern with analysis, we realized in the course of working on kallisto that speed is also paramount, and not just as a matter of convenience. The speed of kallisto has three major implications:

1. It allows for efficient bootstrapping. All that is required for the bootstrap are reruns of the EM algorithm, and those are particularly fast within kallisto. The result is that we can accurately estimate the uncertainty in abundance estimates. One of my favorite figures from our paper, made by Harold, is this one:

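[Figure: per-transcript variance from kallisto bootstraps (x-axis) versus variance across subsamples (y-axis), colored by abundance]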

It is based on an analysis of 40 samples of 30 million reads subsampled from 275 million rat RNA-Seq reads. Each dot corresponds to a transcript and is colored by its abundance. The x-axis shows the variance estimated from kallisto bootstraps on a single subsample while the y-axis shows the variance computed from the different subsamples of the data. We see that the bootstrap recapitulates the empirical variance. This result is non-trivial: the standard dogma, that the technical variance in RNA-Seq is “Poisson” (i.e. proportional to the mean), is false, as shown in Supplementary Figure 3 of our paper (the correlation becomes 0.64). Thus, the bootstrap will be invaluable when incorporated in downstream applications and we are already working on some ideas.

2. It is not just the kallisto quantification that is fast; the index building, and even compilation of the program are also easy and quick. The implication for biologists is that RNA-Seq analysis now becomes interactive. Instead of “freezing” an analysis that might take weeks or even months, data can be explored dynamically, e.g. easily quantified against different transcriptomes, or re-quantified as transcriptomes are updated. The ability to analyze data locally instead of requiring cloud computation means that analysis is portable, and also easily secure.

3. We have found the fast turnaround of analysis helpful in improving the program itself. With kallisto we can quickly check the effect of changes in the algorithms. This allows for much faster debugging of problems, and also better optimization. It also allows us to release improvements knowing that users will be able to test them without resorting to a major computation that might take months. For this reason we’re not afraid to say that some improvements to kallisto will be coming soon.
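
The bootstrap in the first point can be sketched in a few lines. This is a simplified illustration under loudly stated assumptions, not kallisto's implementation: reads are summarized as counts per set of compatible transcripts, the EM ignores transcript lengths, and the function names and toy counts are hypothetical. Each bootstrap replicate resamples the reads with replacement and reruns the EM; the spread of the resulting estimates gives a per-transcript variance.

```python
import random
from collections import Counter
from statistics import variance


def em_abundances(class_counts, transcripts, n_iter=200):
    """Estimate transcript proportions by EM.

    class_counts maps a frozenset of compatible transcript names to the
    number of reads with exactly that compatibility set. Transcript lengths
    are ignored here (an assumption made to keep the sketch short).
    """
    theta = {t: 1.0 / len(transcripts) for t in transcripts}
    total = sum(class_counts.values())
    for _ in range(n_iter):
        assigned = dict.fromkeys(transcripts, 0.0)
        for members, count in class_counts.items():
            denom = sum(theta[t] for t in members)
            if denom == 0:
                continue
            # E-step: split the reads of this class by current abundances.
            for t in members:
                assigned[t] += count * theta[t] / denom
        # M-step: renormalize the fractional read assignments.
        theta = {t: assigned[t] / total for t in transcripts}
    return theta


def bootstrap_variances(class_counts, transcripts, n_boot=50, seed=0):
    """Resample reads with replacement, rerun the EM, return variances."""
    rng = random.Random(seed)
    classes = list(class_counts)
    weights = [class_counts[c] for c in classes]
    n_reads = sum(weights)
    samples = {t: [] for t in transcripts}
    for _ in range(n_boot):
        resampled = Counter(rng.choices(classes, weights=weights, k=n_reads))
        estimate = em_abundances(resampled, transcripts)
        for t in transcripts:
            samples[t].append(estimate[t])
    return {t: variance(samples[t]) for t in transcripts}


# Hypothetical toy data: 30 reads unique to t1, 10 unique to t2, 60 shared.
transcripts = ["t1", "t2"]
class_counts = {
    frozenset({"t1"}): 30,
    frozenset({"t2"}): 10,
    frozenset({"t1", "t2"}): 60,
}
theta = em_abundances(class_counts, transcripts)
variances = bootstrap_variances(class_counts, transcripts)
```

Because only the EM is rerun, and not the read processing, each replicate is cheap, which is the point made above about bootstrapping being efficient.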

As someone who has worked on RNA-Seq since the time of 32 bp reads, I have to say that kallisto has personally been extremely liberating. It offers freedom from the bioinformatics core facility, freedom from the cloud, freedom from the multi-core server, and in my case freedom from my graduate students: for the first time in years I’m analyzing tons of data on my own; because of the simplicity and speed I find I have the time for it. Enjoy!


Filed under: papers, q-bio.QM, RNA-Seq
biocs
2573 days ago
Dresden, Germany