Who chats more, me or my GF? Message analysis in R
This post is dedicated to someone very special, my GF. Hope you like it ;).
I made a dump of six months (March to August 2015) of our WhatsApp® conversation and found trends about who chats more (number of messages), who takes more...
Spam comment analysis in R
Imagine logging into your blog and finding more than a hundred spam messages. Not cool! I am not letting the spammers win, so I decided to crack some patterns and try to learn something about these annoying little bots.
For this post, I am performing spam...
Introduction to text mining in R
I was checking some Machine Learning challenges at Hackerrank and found a particular challenge that consists of document classification. The source is over here. I downloaded the dataset and decided to make my own text mining analysis instead. The dataset...
Introduction to K-means in R
"k-means clustering is a method of vector quantization, originally from signal processing, that is popular for cluster analysis in data mining. k-means clustering aims to partition n observations into k clusters in which each observation...
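As a rough illustration of the partitioning idea in the quote above, here is a minimal sketch using base R's `kmeans()`. The built-in `iris` data, the choice of three clusters, and the seed are assumptions made for this example only; they are not taken from the post itself.

```r
# Minimal k-means sketch on the built-in iris data (an assumed example dataset).
set.seed(42)                      # k-means starts from random centers, so fix the seed
features <- scale(iris[, 1:4])    # standardize the four numeric measurement columns

# Partition the 150 observations into k = 3 clusters; nstart = 25 repeats the
# random initialization and keeps the run with the lowest within-cluster sum of squares.
fit <- kmeans(features, centers = 3, nstart = 25)

# Cluster sizes, and a cross-tabulation against the known species labels
print(fit$size)
print(table(cluster = fit$cluster, species = iris$Species))
```

Scaling the columns first is a design choice: k-means relies on Euclidean distance, so unscaled variables with large ranges would otherwise dominate the result.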
Using Neural Networks to fit equations in R
"In machine learning and cognitive science, artificial neural networks (ANNs) are a family of statistical learning algorithms inspired by biological neural networks (the central nervous systems of animals, in...
The tale of two algorithms: the importance of algorithm analysis in our daily programming tasks
Quoting Wikipedia: "In computer science, the analysis of algorithms is the determination of the amount of resources (such as time and storage) necessary to execute them. Most...
What can we say about world fertility, life expectancy and population size?
The purpose of this post is not to "reinvent the wheel", but rather to serve as a playground for a somewhat difficult Data Science problem where the data is either highly correlated or not correlated at all...
Handling large FASTA sequence datasets in R: Shuffle and retrieve "n" number of sequences of fixed length from the whole FASTA file and export them in a new FASTA file
When you are working with large FASTA datasets, you are likely to find that the sequences are in sort of a mixed...
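The task named in the title can be sketched in base R as follows. This is not the script from the post: the helper names `read_fasta()` and `sample_fasta()`, the temporary file names, and the reading of "fixed length" as "keep only sequences whose length equals a chosen value" are all assumptions made for illustration.

```r
# Sketch: shuffle a FASTA file and export n sequences of one fixed length.
# All function names and parameters here are hypothetical, not from the post.

read_fasta <- function(path) {
  lines <- readLines(path)
  lines <- lines[nzchar(lines)]                # drop empty lines
  is_header <- startsWith(lines, ">")
  headers <- lines[is_header]
  # Assign every sequence line to the record (header) that precedes it
  record <- factor(cumsum(is_header)[!is_header], levels = seq_along(headers))
  seqs <- vapply(split(lines[!is_header], record), paste, character(1), collapse = "")
  names(seqs) <- headers                       # named character vector: header -> sequence
  seqs
}

sample_fasta <- function(in_path, out_path, n, seq_len) {
  records <- read_fasta(in_path)
  keep <- records[nchar(records) == seq_len]   # only sequences of the requested length
  if (length(keep) < n) stop("fewer than n sequences of the requested length")
  picked <- keep[sample.int(length(keep), n)]  # random draw without replacement
  # Interleave headers and sequences: header 1, sequence 1, header 2, sequence 2, ...
  writeLines(as.vector(rbind(names(picked), picked)), out_path)
  invisible(picked)
}

# Usage on a tiny made-up FASTA file
tmp_in <- tempfile(fileext = ".fasta")
tmp_out <- tempfile(fileext = ".fasta")
writeLines(c(">a", "ACGT", ">b", "AC", "GT", ">c", "ACG", ">d", "TTTT"), tmp_in)
set.seed(1)
picked <- sample_fasta(tmp_in, tmp_out, n = 2, seq_len = 4)
```

Reading the whole file with `readLines()` keeps the sketch short but holds everything in memory; for really large datasets a streaming reader or a Bioconductor package would likely be a better fit.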
Extracting upstream regions of a RefSeq human gene list in R using Bioconductor
Suppose that you want to do local mapping of the upstream regions of a given list of RefSeq IDs in a particular genome in R using Bioconductor. Download the script here.
In this case, you may take a look at...
Upgrade and update R 2.X to R 3.X in Debian Wheezy 7.X
Following the instructions from CRAN, you need to add the R backports to your sources list.
FIRST PART: ADD R BACKPORTS:
First, open a Terminal and open the sources.list file:
$ gksudo gedit /etc/apt/sources.list