How to access the Bluesky API from R, fast

It is a truth universally acknowledged that Elon Musk has ruined Twitter for everyone else, which is why everyone else is moving over to Bluesky. However, one specific way in which Twitter has been broken has created insufficient outrage: programmatically accessing Twitter, something that was taken for granted for a long time, has become nearly impossible, unless one is willing and able to pay seriously big dollars to the Edgelord. This affects even small hobbyist projects such as the Radical Right Research Robot, whose intrepid posting was suspended for a few months after a series of unwelcome and unpredictable changes at Musk HQ.

All the bot needs to do is post, and even this small grace could be withdrawn any day. Everything else, even stuff as trivial as looking at the list of its followers, is verboten under the free plan.

Conversely, the Bluesky API is wide open at the moment. And thanks to the work of Johannes Gruber, Benjamin Guinaudeau, and Fabio Votta, who have created the atrrr package, access to this API from within the R ecosystem is trivially easy.


While this opens the way for fancy social media analyses on this exciting new platform, I came across atrrr because I wanted to solve a much simpler problem: finding interesting follows amongst the relatively large number of new people who followed me over the last few days. More specifically, I want a list of my followers a) whom I do not follow yet and b) who have some interesting keyword in their description. This can be done in eight lines even by a noob like me.

First, load the library and authorise your script using an app password. The command will even open the right submenu on Bluesky for you to generate a new app password if necessary. No need to create projects and apps à la dev.twitter.com.

#install.packages("atrrr")
library(atrrr)
auth("your-handle-without-the-@")

Second, get two dataframes with a) accounts that follow you and b) accounts that you follow. Set the limit according to your needs (the default is just 25).

myfollowers <- get_followers(actor="your-handle-without-the-@",limit=4000)
myfollows <- get_follows(actor="your-handle-without-the-@",limit=4000)

In my case, this took just a few seconds for each data frame. These data frames contain the users’ unique identifiers (“dids”), handles, names, and descriptions. Using the identifiers, it is easy to pick out the followers I do not follow yet:

# find those of my followers who are not followed by me
# (dplyr:: prefix and no pipe, so this runs without attaching dplyr)
not.followed.by.me <- dplyr::anti_join(myfollowers, myfollows, by = "did")
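
To see what the anti-join does without touching the API, here is a minimal sketch on two made-up data frames. The dids and handles are invented for illustration, and it uses base R's `%in%` instead of dplyr, which should select the same rows when matching on a single key column.

```r
# Toy stand-ins for the data frames returned by get_followers()/get_follows().
# All dids and handles below are invented for illustration.
myfollowers_toy <- data.frame(
  did = c("did:plc:aaa", "did:plc:bbb", "did:plc:ccc"),
  actor_handle = c("alice.example.com", "bob.example.com", "carol.example.com"),
  stringsAsFactors = FALSE
)
myfollows_toy <- data.frame(
  did = c("did:plc:bbb"),
  actor_handle = c("bob.example.com"),
  stringsAsFactors = FALSE
)

# Keep followers whose did does not appear among the accounts I follow.
not_followed_toy <- myfollowers_toy[!myfollowers_toy$did %in% myfollows_toy$did, ]
print(not_followed_toy$actor_handle)
```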

After the recent influx of new users and resulting follow fest, I ended up with a longish list. But filtering for keywords in the description is, again, easy. Say you wanted to find fellow radical right nerds, because it’s only polite to follow them back? This does the trick.

# Keep only accounts whose description contains a keyword, e.g. "radical right"
candidates <- dplyr::filter(not.followed.by.me, grepl("radical right", actor_description, ignore.case = TRUE))
# make a clickable URL
candidates$profile.url <- paste0("https://bsky.app/profile/", candidates$actor_handle)
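
If one keyword is not enough, several can be combined into a single regular expression. The following is a sketch on invented data, assuming the same `actor_description` and `actor_handle` columns as above; the keyword list is only an example.

```r
# Invented example rows; real data would come from the steps above.
toy <- data.frame(
  actor_handle = c("a.example.com", "b.example.com", "c.example.com"),
  actor_description = c("I study the Radical Right", "Cat pictures", NA),
  stringsAsFactors = FALSE
)

# Join the keywords with "|" so that any one of them counts as a match.
keywords <- c("radical right", "populis", "far right")
pattern <- paste(keywords, collapse = "|")

# grepl() returns FALSE for missing descriptions, so NA rows are dropped.
hits <- toy[grepl(pattern, toy$actor_description, ignore.case = TRUE), ]
hits$profile.url <- paste0("https://bsky.app/profile/", hits$actor_handle)
print(hits$profile.url)
```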

The last line, which creates links to the users’ profiles, is not strictly necessary. One could simply loop over the handles and follow all of them automatically. But I would rather have a look before deciding what to do. Printing the column in the terminal creates a list of clickable links that will open in the browser for inspection.
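
The automatic route could be sketched like this. `follow_all()` is a hypothetical helper, not part of atrrr; it takes the function that does the actual following as an argument, so it can be tried with a harmless stand-in first. Passing `atrrr::follow` instead assumes that this function accepts a single handle as its first argument, so check the package documentation before relying on it.

```r
# Hypothetical helper (not part of atrrr): call follow_fun once per handle,
# pause between calls, and report which calls succeeded.
follow_all <- function(handles, follow_fun, pause = 0) {
  ok <- vapply(handles, function(h) {
    result <- tryCatch({
      follow_fun(h)
      TRUE
    }, error = function(e) FALSE)
    Sys.sleep(pause)
    result
  }, logical(1))
  names(ok) <- handles
  ok
}

# Dry run with a stand-in that only prints a message, so nothing is followed.
dry_run <- function(handle) message("would follow ", handle)
res <- follow_all(c("a.example.com", "b.example.com"), dry_run)
print(res)
```

With real data, the call would be something like `follow_all(candidates$actor_handle, atrrr::follow, pause = 1)`, under the assumption stated above.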

And that’s it. Enjoy the package. Everything works really, really well.
