Data Break

How a leftist twitter trend played out

Aug 8, 2020

On August 4-6, a Twitter trend took over in which people dropped the convention of not “following back” and essentially followed every person using the hashtag #NoComradesUnder1k. I decided to pull out the rtweet package to see how the trend was faring after a day: library(rtweet) comrade_df <- search_tweets("#NoComradesUnder1k", n = 100000, retryonratelimit = TRUE, parse = TRUE, include_rts = FALSE) I had gained quite a few followers (250 in 5 hours vs 500 over multiple years on another account), and was wondering what that Twitter network looked like, where people were using the hashtag, and whether I could automate a process to give my new followers a gift of their own network / friends (e.

Catching up with dplyr 1.0 with fivethirtyeight data

Jul 7, 2020

Making Beautiful Streetmaps with ggplot2

Nov 11, 2019

Since I was an undergrad, I’ve found it difficult to create a really nice streetmap. They always tend to look cluttered, line widths for different street types are always a challenge… the list goes on. I found this amazing post (and site) created by Christian Burkhart, which gives some great tips on graphic design and data visualization. This post is largely a walk-through of the process he uses. if(!require("pacman")) install.packages("pacman") pacman::p_load("osmdata", "tidyverse", "sf") Following the tutorial, I can extract the data for my city (Montreal) using the osmdata package:

Makeover Monday Data Visualization (Montreal dataviz meetup)

Nov 11, 2019

As part of a Montreal-based meetup group that I’ve attended a few times (far less often than I would like), I’m reproducing here a “Makeover Monday” data visualization of a UNESCO dataset found here. I’ll be using a few packages for this project: library(pacman) p_load("reactable", "tidyverse", "readxl", "DataExplorer") Based on the visualization from UNESCO, it looks like some countries may be excluded from the dataset.

Use dplyr like it's meant to be used

Jun 6, 2019

After finding myself going back to previous projects a few times to review some very useful lines using lesser-known dplyr functions, I decided I should write them into the eternal bottomless pit that is web-blogging. Using mutate_at and case_when: I love this example. I found myself constantly repeating case_when() lines within a mutate() to change variables based on their names, and knew there had to be a better way. I’m sure it could be neater, but until I make it more efficient, this is what I have:

Scrape it 'til you make it

Jun 6, 2019

The other night I was reading and kept seeing some very interesting lines in the text, and I thought, “is it possible to identify the more quote-worthy sentence(s) from a text?” Realizing that this is a pretty big question, I decided to tone it down a bit and ask a more reasonable one: could I create a database of quotes by some of my favourite authors? Which led me to ask… could I create a database search tool in a shiny app so anyone could check out quotes by their favourite author?

MAPS Crowdsourced Open Science Project: Data Imputation

Jun 6, 2019

Missing data can compromise the inferences made from clinical trials, and the mechanism (or reason) why the data is missing in the first place determines whether an analytic method can be used to correct that missingness at all. There are three mechanisms which can cause missing data: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR) (Jakobsen, Gluud, Wetterslev & Winkel, 2017).
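To make the three mechanisms concrete, here is a minimal sketch in base R using simulated data. Everything in it is an assumption made for illustration (the variables `age` and `outcome`, the 30% missingness rate, and the logistic missingness models); none of it comes from the MAPS data or from the cited paper.

```r
# Simulated example (hypothetical variables, not trial data):
# impose missingness on `outcome` under each of the three mechanisms.
set.seed(42)
n <- 1000
age <- rnorm(n, mean = 50, sd = 10)
outcome <- 0.5 * age + rnorm(n, sd = 5)

# MCAR: every value has the same 30% chance of being missing
miss_mcar <- runif(n) < 0.3

# MAR: the chance of being missing depends only on an observed covariate (age)
miss_mar <- runif(n) < plogis((age - 50) / 5)

# MNAR: the chance of being missing depends on the unobserved outcome itself
miss_mnar <- runif(n) < plogis((outcome - mean(outcome)) / 5)

# Compare the complete-case mean under each mechanism to the full-data mean
observed_means <- c(
  full = mean(outcome),
  mcar = mean(outcome[!miss_mcar]),
  mar  = mean(outcome[!miss_mar]),
  mnar = mean(outcome[!miss_mnar])
)
round(observed_means, 2)
```

In this setup the complete-case mean under MCAR should stay close to the full-data mean, while under MAR and MNAR it shifts downward because higher values are more likely to be missing. The MAR shift can in principle be corrected using the observed `age`; the MNAR shift cannot be corrected from the observed data alone.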

How to render parameterized pdfs in rmarkdown

May 5, 2019

MAPS Crowdsourced Open Science Project: Data Exploration

Apr 4, 2019

OpenNorth Team Launches Contribution to University of Bristol MAPS crowdsourced analysis

Apr 4, 2019

My colleagues and I here at OpenNorth are contributing to a very interesting crowdsourced data analysis project out of the University of Bristol’s Jean Golding Institute: mapping the analytical paths of a crowdsourced data analysis, or MAPS. 1/ Call for collaborators - #MAPS project. @CO90s @ukrepro and @JGIBristol invite researchers of any level and any area to analyse the @CO90s data to answer this question: 'Is computer use during weekdays and weekends at 16 years old associated with depression at 18 years old?

Corey Pembleton
