March 2015

25 Mar

Using Neural Networks to fit equations in R

 
Quoting Wikipedia: "In machine learning and cognitive science, artificial neural networks (ANNs) are a family of statistical learning algorithms inspired by biological neural networks (the central nervous systems of animals, in...

19 Mar

The tale of two algorithms, importance of algorithm analysis in our daily programming tasks

Quoting Wikipedia: "In computer science, the analysis of algorithms is the determination of the amount of resources (such as time and storage) necessary to execute them. Most...

17 Mar

What can we say about world fertility, life expectancy and population size?

 
The purpose of this post is not to "reinvent the wheel", but rather to serve as a playground for a somewhat difficult Data Science problem where the data is either highly correlated or not correlated at all...

15 Mar

Simple Bash command line to reduce the length of the FASTA header lines

 
Hi there. How many times have we had a FASTA file that contains huge FASTA headers like this: >gi|600513|gb|M21306.1|DROTRPC Drosophila melanogaster photoreceptor...

15 Mar

Handling large FASTA sequence datasets in R: Shuffle and retrieve “n” number of sequences of fixed length from the whole FASTA file and export them in a new FASTA file

 
When you are working with large FASTA datasets, you are likely to find that the sequences are in sort of a mixed...

14 Mar

Extracting upstream regions of a RefSeq human gene list in R using Bioconductor

 
Suppose that you want to do local mapping of the upstream regions of a given list of RefSeq IDs in a particular genome in R using Bioconductor. Download the script here. In this case, you may take a look at...

14 Mar

Accuracy versus F score: Machine Learning for the RNA Polymerases

 
Hello, today I'm going to show you the difference between two common performance measures (useful not only for Machine Learning, but in every scientific field). Until now, I have found...

14 Mar

Upgrade and update R 2.X to R 3.X in Debian Wheezy 7.X

 
Following the instructions from CRAN, you need to add the R backports to your sources list. FIRST PART: ADD R BACKPORTS: First, open a Terminal and open the sources.list file: $ gksudo gedit /etc/apt/sources.list Then,...

14 Mar

Introduction to Markov Chains and modeling DNA sequences in R

 
Markov chains are probabilistic models that can be used to model sequences given a probability distribution, and they are also very useful for characterizing certain parts of a DNA or protein...