[AISWorld] R Package "dicecrawler" for scraping data from Dice.com

Vlad Krotov vladkrotov at gmail.com
Wed Jul 19 17:42:11 EDT 2017


Hello,


My R package called “dicecrawler” has been published on CRAN:
https://cran.r-project.org/web/packages/dicecrawler/index.html


The package can be quite useful to IS researchers interested in various IT
workforce trends. It can save researchers days (if not months) of work by
eliminating the need for manual data collection!


The package is, in essence, a Web crawler for Dice (http://www.dice.com),
a leading IT employment website. The function getjobs() automatically
crawls Dice, downloads job descriptions together with their metadata, and
organizes the data in a “tidy data” format (see Wickham, 2014). The package
makes use of the JobSearch API supplied by Dice:
http://www.dice.com/common/content/util/apidoc/jobsearch.html


One needs to write only one line of code to download thousands of job
descriptions from Dice. The data can then be analyzed using various
text-mining tools in R, or saved to a file (e.g., an Excel spreadsheet) and
analyzed manually as part of a more traditional content analysis.
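
To illustrate the two downstream options mentioned above (saving to a file or running simple text mining), here is a minimal sketch in base R. It does not call the package: the data frame below is a made-up stand-in for what getjobs() returns, and its column names (jobTitle, company, description) are assumptions rather than the package's documented output. Check ?getjobs for the actual arguments and columns.

```r
# Hypothetical stand-in for the tidy data frame returned by getjobs().
# Column names here are assumptions, not the package's documented schema.
# With the package installed, this would be replaced by a call such as
#   jobs <- dicecrawler::getjobs(...)   # see ?getjobs for the arguments
jobs <- data.frame(
  jobTitle    = c("Data Analyst", "Java Developer"),
  company     = c("Acme", "Globex"),
  description = c("SQL and R skills required",
                  "Java and SQL experience required"),
  stringsAsFactors = FALSE
)

# Option 1: save to a CSV file (opens in Excel) for manual content analysis.
out_file <- file.path(tempdir(), "dice_jobs.csv")
write.csv(jobs, out_file, row.names = FALSE)

# Option 2: a very simple text-mining step, counting how often each
# lower-cased word appears across all job descriptions.
words <- unlist(strsplit(tolower(jobs$description), "[^a-z]+"))
words <- words[nchar(words) > 0]
term_freq <- sort(table(words), decreasing = TRUE)
print(head(term_freq, 5))
```

Because each row is one job posting and each column one variable, the same data frame can be passed to dedicated text-mining packages without reshaping.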


This package is distributed under the GPL-3 license:
https://cran.r-project.org/web/licenses/GPL-3


If you use this package in your research, please cite it as follows:


Vlad Krotov (2017). dicecrawler: Downloads Job Descriptions from Dice.com.
R package version 0.1.0. URL
https://cran.r-project.org/web/packages/dicecrawler/.


This is the first release of the package, so it is still a bit “raw”. If
you see any errors or would like to make suggestions for future releases,
please contact me at vkrotov at murraystate.edu


Thank you,

Vlad Krotov


