One R Package a Day (@RLangPackage on Twitter) is a Python-based Twitter bot deployed and scheduled to run on Heroku’s Free Tier. The purpose of the Twitter account is to feature the diversity of packages that R has to offer, especially the thousands of great packages waiting to be discovered by the casual R user. This project was built with a host of resources, which are listed in my blog post introducing the account to the world.
View source code on GitHub at: https://github.com/StevenMMortimer/one-r-package-a-day
When writing another blog post I was struck by how many R packages there are (over 12,600 at the time of this writing). I immediately wondered how long it would take to get through them all at one package a day. The answer is 34+ years (12,600 / 365 ≈ 34.5). I’m not sure I can religiously download one a day, but I could make a bot to inspire others to download a package a day. The bot’s basic algorithm is the following:
I had already compiled a list of R packages with their name, description, URL, CRAN download count, and GitHub star count. By using this data, and updating it regularly, the script can pick from a menu of great R packages. To prevent the bot from tweeting about the same package over and over again, the script scrapes the account’s tweets and then randomly selects rows from the list of packages, making sure it hasn’t tweeted about the selected package before.
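As a rough illustration of the selection step, here is a minimal sketch. It assumes the names of previously featured packages have already been extracted from the account’s tweets, and it uses plain Python lists rather than the pandas data frame the bot works with, purely to keep the example dependency-free. The function name and the three sample rows are hypothetical and are not taken from the project’s source code.

```python
import random

# Illustrative sample rows only; the real list also carries CRAN download
# counts and GitHub star counts.
SAMPLE_PACKAGES = [
    {"name": "dplyr", "url": "https://cran.r-project.org/package=dplyr"},
    {"name": "ggplot2", "url": "https://cran.r-project.org/package=ggplot2"},
    {"name": "data.table", "url": "https://cran.r-project.org/package=data.table"},
]


def pick_untweeted_package(packages, tweeted_names, rng=random):
    """Return a random package not yet featured, or None if none remain.

    `tweeted_names` is assumed to hold the package names already found in
    the account's tweets. R package names are case-sensitive, so names are
    compared exactly.
    """
    already = set(tweeted_names)
    candidates = [p for p in packages if p["name"] not in already]
    if not candidates:
        return None
    return rng.choice(candidates)


# Two of the three sample packages were already featured, so only one remains.
choice = pick_untweeted_package(SAMPLE_PACKAGES, {"dplyr", "ggplot2"})
print(choice["name"])  # prints "data.table"
```

Filtering first and sampling second means the script never has to retry a random draw, and an empty result signals that every package in the list has been featured.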
Heroku makes it a breeze to deploy this type of script and host it for free. Simply create an account and then create a Python app. I decided to link my GitHub repository with the app so that any code changes that are pushed to GitHub will trigger a new deployment on Heroku’s servers.
I found this easier than using the Heroku CLI, which would require me to push new changes myself. In addition, you will need to configure your app’s environment by including a requirements.txt file and adding config vars for the Twitter credentials. The requirements.txt file is just a list of the Python libraries that your app needs in order to run. In this case it is python-dotenv, pandas, and twitterAPI. These libraries are installed on the virtual server when the app is built and deployed.
Using config vars in Heroku is important so that the Twitter credentials work the same way locally as they do on Heroku’s servers. Because I’m hosting the project’s code on GitHub, I didn’t want to put my Twitter credentials openly in the script, so I keep them in a .env file locally on my computer. On Heroku I have set up the config vars with the same key-value pairs. This way, whenever the script runs, it can read these values from the .env file locally or from the app’s environment when it runs on Heroku.
I do recommend downloading the Heroku CLI just to be able to run and check your script. I could trigger the script using the command:
heroku run -a YOUR-APP-NAME python YOUR-SCRIPT-NAME.py
and check the logs using the command:
heroku logs -a YOUR-APP-NAME --tail
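Returning to the credentials for a moment: the arrangement of a .env file locally and config vars on Heroku can be sketched roughly as follows. This is a minimal sketch, not the project’s actual code. The four key names and the function name are hypothetical, so substitute whatever names your own .env file and config vars use. The only library call is load_dotenv() from python-dotenv, which reads a local .env file into the process environment when one exists and, by default, leaves already-set variables untouched.

```python
import os

# Hypothetical key names; use the same names in .env and in Heroku's config vars.
REQUIRED_KEYS = (
    "CONSUMER_KEY",
    "CONSUMER_SECRET",
    "ACCESS_TOKEN_KEY",
    "ACCESS_TOKEN_SECRET",
)


def load_twitter_credentials(environ=None):
    """Return the Twitter credentials as a dict, or raise if any are missing.

    With no argument, values come from the process environment. Locally they
    are loaded from a .env file first; on Heroku there is no .env file and
    the config vars are already present in the environment.
    """
    if environ is None:
        try:
            from dotenv import load_dotenv  # provided by python-dotenv
            load_dotenv()  # does nothing if no .env file is found
        except ImportError:
            pass  # python-dotenv not installed; rely on the environment as-is
        environ = os.environ

    missing = [key for key in REQUIRED_KEYS if not environ.get(key)]
    if missing:
        raise RuntimeError("Missing Twitter credentials: " + ", ".join(missing))
    return {key: environ[key] for key in REQUIRED_KEYS}
```

Calling load_twitter_credentials() at the top of the script gives the same result in both places, and failing early with the names of the missing keys makes a misconfigured app easy to spot in the Heroku logs.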
Finally, the Twitter bot wouldn’t be very robotic if it didn’t tweet every day on its own. I used the Heroku Scheduler to automate the execution of the script each day. Heroku Scheduler has an easy-to-use dashboard where you can add a job to execute the Python script at a set time.