Data Scientist - Benjamin Tovar

Who chats more, me or my GF? Message analysis in R

This post is dedicated to someone very special, my GF. Hope you like it ;).


I made a dump of six months (March to August 2015) of our WhatsApp® conversation. I looked for trends in who chats more (number of messages) and who takes longer to answer messages by month, day of the month and weekday, and finally plotted wordclouds to show the most frequent words per author.
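For readers who want to try something similar, here is a minimal Python sketch of the first step, turning a chat dump into structured records. It is not the R code behind this post. The line layout (`2015-03-01 22:15 - Ben: text`), the `Message` type and the function names are all assumptions made for illustration; real exports vary by platform and locale, so the regular expression would need adjusting.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional


class Message(NamedTuple):
    when: datetime
    author: str
    text: str


# Assumed (hypothetical) line layout: "2015-03-01 22:15 - Ben: hola amor"
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}) - ([^:]+): (.*)$")


def parse_line(line: str) -> Optional[Message]:
    """Return a Message, or None if the line does not match the assumed layout."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    stamp, author, text = match.groups()
    return Message(datetime.strptime(stamp, "%Y-%m-%d %H:%M"), author, text)


def parse_dump(lines):
    """Parse every matching line; lines that do not match are skipped."""
    return [m for m in map(parse_line, lines) if m is not None]


# Made-up sample lines, not taken from the real conversation.
sample = [
    "2015-03-01 22:15 - Ben: hola amor",
    "2015-03-01 22:18 - Anne: jajaja que haces",
    "this line is not a message",
]
messages = parse_dump(sample)
print(len(messages), messages[0].author, messages[1].when)
```

Skipping non-matching lines silently keeps the sketch short; for a real dump it would be safer to count them, since multi-line messages would otherwise be lost without notice.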


Of the 36,129 messages analysed, 19,550 were mine and 16,579 were Anne’s, so I won this round (Ben:1, Anne:0).
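The count itself is a one-liner in most languages. As a rough Python sketch (again, not the original R code), the totals above can be rebuilt and turned into shares; the function name `message_share` is made up for this example, and the author list is reconstructed from the reported totals rather than read from the real dump.

```python
from collections import Counter


def message_share(authors):
    """Map each author to (message count, share of all messages)."""
    counts = Counter(authors)
    total = sum(counts.values())
    return {name: (n, n / total) for name, n in counts.items()}


# Rebuild the totals reported in the post instead of reading the real dump.
authors = ["Ben"] * 19550 + ["Anne"] * 16579
shares = message_share(authors)
for name, (n, share) in shares.items():
    print(f"{name}: {n} messages ({share:.1%})")
```

With these totals the split works out to roughly 54% of the messages for Ben and 46% for Anne.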


Who takes longer to reply to messages? OK, definitely me: I took an average of 8.0 minutes and Anne an average of 5.4. Anne won this round (Ben:1, Anne:1).
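The post does not spell out how the reply time was computed, so the following Python sketch rests on an assumption: a "reply" is any message whose author differs from the author of the previous message, and the gap in minutes is credited to the person replying. The function name and the sample timestamps are invented for illustration.

```python
from collections import defaultdict
from datetime import datetime


def mean_reply_minutes(messages):
    """Average reply gap per author, for (timestamp, author) pairs sorted by time.

    Assumption: a reply is a message whose author differs from the previous
    message's author; the gap is credited to the author who replied.
    """
    gaps = defaultdict(list)
    for (prev_time, prev_author), (time, author) in zip(messages, messages[1:]):
        if author != prev_author:
            gaps[author].append((time - prev_time).total_seconds() / 60)
    return {author: sum(g) / len(g) for author, g in gaps.items()}


# Made-up sample, not data from the real conversation.
chat = [
    (datetime(2015, 3, 1, 22, 0), "Anne"),
    (datetime(2015, 3, 1, 22, 10), "Ben"),   # Ben replies after 10 minutes
    (datetime(2015, 3, 1, 22, 14), "Anne"),  # Anne replies after 4 minutes
    (datetime(2015, 3, 1, 22, 15), "Anne"),  # same author again, not a reply
    (datetime(2015, 3, 1, 22, 21), "Ben"),   # Ben replies after 6 minutes
]
averages = mean_reply_minutes(chat)
print(averages)
```

Under this definition overnight silences count as very long replies, so a real analysis might cap or drop gaps above some threshold before averaging.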


Comparing the number of messages and the average minutes between messages, by author and day of the month.

Looking at the top barplot, it took me an average of 25 minutes to reply to messages on the 1st of each month analysed. I also tend to take longer to reply until the 3rd, and the pattern appears again in the last days of the month (30th and 31st). I am usually busier on those days (paying bills, rent, not cool bro, not cool). For Anne, in contrast, the Delta values between days are closer to 0 than mine (this means that she’s more constant in the number of messages sent). Anne won this round (Ben:1, Anne:2).

Exploring the bottom barplot, it looks like we send a similar number of messages per day. That is, taking the 2nd and the 21st as examples, the number of messages differs between these days, but within the same day the number of messages each of us sent is very similar (like a linear correlation). So we reply to each other’s messages proportionally. Draw round (Ben:2, Anne:3).
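The "like a linear correlation" remark can be checked numerically. Here is a small Python sketch, under the assumption that the comparison is between each author's message count per day of the month; it computes the Pearson correlation by hand. The helper names and the sample data are invented and are not the post's data.

```python
from collections import Counter
from datetime import date
from math import sqrt


def counts_by_day_of_month(dated_messages, author):
    """Count one author's messages for each day of the month (1 to 31)."""
    counts = Counter(d.day for d, who in dated_messages if who == author)
    return [counts.get(day, 0) for day in range(1, 32)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Tiny made-up sample of (date, author) pairs: a busy 2nd and a quiet 21st.
sample = (
    [(date(2015, 3, 2), "Ben")] * 6 + [(date(2015, 3, 2), "Anne")] * 5
    + [(date(2015, 3, 21), "Ben")] * 2 + [(date(2015, 3, 21), "Anne")] * 2
)
ben = counts_by_day_of_month(sample, "Ben")
anne = counts_by_day_of_month(sample, "Anne")
print(round(pearson(ben, anne), 3))
```

A coefficient close to 1 would support the reading that busy days are busy for both of us and quiet days are quiet for both.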


Same analysis, but now comparing by month. Yep, I always took more time to reply to messages, but in my defence, I always sent more of them. As these trends were already discussed, no extra points.


Same analysis, but now comparing by weekday. Conclusion is the same as above.


Additionally, Sundays are usually the days with the most messages, but the difference from the other weekdays is not very noticeable.
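Grouping by weekday only needs the date of each message. A minimal Python sketch follows, with an invented function name and made-up dates (1 and 8 March 2015 were Sundays, 2 March a Monday):

```python
from collections import Counter
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def messages_per_weekday(dates):
    """Count messages per weekday name; date.weekday() returns 0 for Monday."""
    counts = Counter(d.weekday() for d in dates)
    return {name: counts.get(i, 0) for i, name in enumerate(WEEKDAYS)}


# Made-up sample: one date per message.
dates = [date(2015, 3, 1)] * 3 + [date(2015, 3, 2)] * 2 + [date(2015, 3, 8)]
per_weekday = messages_per_weekday(dates)
print(per_weekday)
```

Because six months do not contain exactly the same number of each weekday, dividing each total by the number of times that weekday occurs in the period would give a fairer comparison.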


Wordcloud: Top 2000 distinctive words per author

Despite the odds, yep, I say “amor” more frequently than you! (Besides, your most distinctive word is “jajaja” >.< ). So, I won this round (Ben:3, Anne:3).
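What makes a word "distinctive" depends on the measure chosen, and the post does not say which one the wordcloud used. As one possible reading, this Python sketch ranks words by the gap between an author's relative frequency and the other author's. The function names and sample messages are made up.

```python
import re
from collections import Counter


def word_counts(texts):
    """Lowercase each message and count its words."""
    return Counter(w for t in texts for w in re.findall(r"\w+", t.lower()))


def distinctive_words(own_texts, other_texts, top=3):
    """Rank an author's words by how much more often they use them.

    Assumed measure: own relative frequency minus the other author's.
    """
    own, other = word_counts(own_texts), word_counts(other_texts)
    own_total, other_total = sum(own.values()), sum(other.values())
    gap = {w: own[w] / own_total - other[w] / other_total for w in own}
    return sorted(gap, key=gap.get, reverse=True)[:top]


# Made-up messages, not quotes from the real conversation.
ben = ["hola amor", "amor que haces", "te quiero amor"]
anne = ["jajaja que", "jajaja hola amor", "jajaja"]
print(distinctive_words(ben, anne, top=1))
print(distinctive_words(anne, ben, top=1))
```

On this toy sample the top word is "amor" for Ben and "jajaja" for Anne, because each uses that word far more often than the other does.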


Wordcloud: Top 2000 common words (regardless of author)

No points for anyone. Our most common shared word is “que” (“what” in English).
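A shared-word ranking can be sketched in Python in a similar way. The assumption here is that "common" means words both authors used, ranked by their combined count; the post does not define it, and the names and sample messages are invented.

```python
import re
from collections import Counter


def word_counts(texts):
    """Lowercase each message and count its words."""
    return Counter(w for t in texts for w in re.findall(r"\w+", t.lower()))


def common_words(texts_a, texts_b, top=3):
    """Words used by both authors, ranked by combined count (ties alphabetical)."""
    a, b = word_counts(texts_a), word_counts(texts_b)
    shared = {w: a[w] + b[w] for w in a.keys() & b.keys()}
    return sorted(shared, key=lambda w: (-shared[w], w))[:top]


# Made-up messages, not quotes from the real conversation.
ben = ["que haces", "que tal amor"]
anne = ["jajaja que", "que si", "amor"]
print(common_words(ben, anne))
```

Words that only one author used, such as "jajaja" in this sample, are excluded by the key intersection.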



The scoreboard shows 3 points for me and 3 points for Anne, so it is a draw ;). Hope you liked the post.



