RNA-seq Fragmentation Bias
While I was reading through RNA-seq protocols, I wondered what the rationale is behind how protocols fragment cDNA. With some digging, I found this nice paper by Mortazavi, Williams, et al., which gave some reasoning and an accompanying investigation.
The rationale for using hydrolysis of RNA before random priming rather than fragmentation of cDNA at the next step was twofold. First, cDNA priming at putatively random sites, if fully successful, will over-represent 5′ ends of transcripts, and this uneven representation will have differing impact on RNAs of different sizes. Second, preliminary data strongly suggested that there are some strongly favored and disfavored sites of random priming, and we observed this in samples that were primed without hydrolysis. It did not, however, correspond to simple GC content bias (Supplementary Fig. 1b). We reasoned that, at room temperature, some RNA secondary structure may shield parts of transcripts from priming while favoring other sites. By fragmenting the RNA, we expected to reduce the amount of such secondary structure, though not completely eliminate it. RNA fragmentation before copying would also be expected to greatly reduce 5′ bias. This protocol gave better overall uniformity than protocols without RNA fragmentation (Supplementary Fig. 1), although some residual and reproducible nonuniformity clearly persists for randomly primed substrates that was not observed in other kinds of Illumina sequencing substrates handled simultaneously, such as chromatin immunoprecipitation sequencing (ChIPSeq) samples (for example, Supplementary Fig. 1c).
Obscure(ish) Definitions
Here are some terms that pop up often enough in molecular biology and bioinformatics to confuse, but not often enough for a definition to be easily found (at least not by hitting “I’m Feeling Lucky” on Google). I have tried to simplify the explanations enough to give intuition into their applications while keeping them easily accessible. I’ll update this as time goes on.
President Mural
While wandering DC I came across this awesome mural by Karla Rodas. She asked for a copy of the photos, so I thought I might as well upload them here while I was at it.
Terminal Keyboard Wizardry
The following are some things I have learned for working faster in terminals while keeping the damage to my hands to a minimum. The tips focus on Bash and Vim keyboard shortcuts, but should have analogs in other environments. Read the rest of this entry »
PDF to OmniOutliner as Images
The goal is to get a PDF into OmniOutliner in a way that allows for easy note taking. The best solution I could come up with is extracting each slide from the PDF and making it a separate bullet point in the outline. Surprisingly, however, there weren’t any clearly documented steps for converting a PDF into a series of images. Preview only lets you do one page at a time, which isn’t an option when you are dealing with something like lecture slides. Going with the “teaching to fish” approach, the following steps will walk you through creating an application using Automator that will ask for a PDF and spew out an image of each slide/page: Read the rest of this entry »
Importing Google Bookmarks into Evernote
Evernote doesn’t support direct importing of Google Bookmarks. A less-than-obvious workaround is to use Delicious as an intermediary. Unfortunately, unlike when you clip a page, the contents of the webpage won’t be in the note and the creation date is lost.
Read the rest of this entry »
Arrow Keys Don’t Work in Vim
I rediscovered this Vim/SSH issue while working on a server at work. Specifically, in normal mode arrow keys don’t do anything and in insert mode they type out characters instead of moving the cursor. Adding the following to my ~/.vimrc file solved the problem:
Read the rest of this entry »
Soulja Boy on Turning One’s Swag On
In the official movie trailer, Soulja Boy reminisces that he “came from the bottom” and how his “dreams for the top” were eventually realized through turning his “swag on”. I have translated his steps from “hopping out of bed” to “turning one’s swag on” to finally “gettin’ money” into a flow diagram for easy comprehension. I urge you to follow along:
Read the rest of this entry »
Suffix Trees and Suffix Arrays
Updated January 12, 2010: Added “A Taxonomy of Suffix Array Construction Algorithms” to “Further Reading” – an awesome, albeit very mathy, paper.
Suffix trees and suffix arrays are ways of organizing data that allow efficient string searches. This article aims to explain these ideas in layman’s terms. Because of this, I have shied away from mathematical notation and computer science notions unless doing so would make the explanation unnecessarily convoluted. I find carefully selected concrete examples the best way to gain intuition into algorithms, so I have used molecular biology to frame the discussion. Wikipedia’s great introduction to genetics entry should provide sufficient background information to follow along, although you should be fine with high-school-level genetics. If you are looking for a detailed explanation of these concepts, take a look at the papers linked to at the bottom of this post under “Further Reading”.
Read the rest of this entry »
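To give a rough feel for the idea before the full post, here is a minimal Python sketch of a suffix array and a search over it. It is not taken from the post itself: the function names, the naive sort-based construction, and the example sequence "GATTACA" are all assumptions chosen for illustration, and the algorithms covered in “Further Reading” build the array far more efficiently.

```python
def build_suffix_array(text):
    """Return the start positions of every suffix of text, in sorted suffix order.

    Naive construction (illustrative assumption): sort positions by the
    suffix that begins there. Fine for short strings, slow for genomes.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, suffix_array, pattern):
    """Return the sorted start positions where pattern occurs in text.

    All suffixes that begin with pattern sit next to each other in the
    suffix array, so two binary searches find the edges of that block.
    """
    k = len(pattern)

    # First binary search: leftmost suffix whose first k letters are >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + k] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Second binary search: leftmost suffix whose first k letters are > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + k] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(suffix_array[start:lo])


# Hypothetical example sequence, used only for this illustration.
sequence = "GATTACA"
suffix_array = build_suffix_array(sequence)
hits_for_a = find_occurrences(sequence, suffix_array, "A")
hits_for_ta = find_occurrences(sequence, suffix_array, "TA")
```

The point of the sketch is that, once the suffixes are sorted, every occurrence of a pattern is found by bisecting the array rather than scanning the whole sequence.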
Batch cropping and tiling pictures
Batch cropping and tiling of pictures is made easy with ImageMagick. The following will crop all JPEG files (*.jpg) in the current directory to 1080 by 500 pixels, starting 200 pixels to the right of and 201 pixels down from the top-left corner.
Read the rest of this entry »