R (programming language)
R is a free and open-source programming language primarily used for statistical computing and data analysis. Developed in the early 1990s by Ross Ihaka and Robert Gentleman as a teaching tool based on the earlier language S, R has since evolved into a versatile environment that supports a wide range of statistical procedures and data visualization techniques. Unlike commercial alternatives such as SAS or SPSS, R is free to use, and it allows code written in other programming languages to be combined within a single project, which adds flexibility for complex analyses.
The development of R is overseen by an international core team whose members collaborate online. R’s ecosystem is enriched by numerous packages, which are collections of code that extend its functionality and are available through the Comprehensive R Archive Network (CRAN). Although R is often perceived as having a steep learning curve, its dynamic capabilities make it a powerful tool for data scientists and statisticians.
The language has undergone several updates, with notable versions released in 2016 and 2020, and continues to be refined in the early 2020s. However, it is important to be aware that some versions of R have recently been found to contain vulnerabilities, highlighting the need for users to stay informed about security updates. Overall, R is regarded as a vital resource in the field of data analysis, appealing to both academic researchers and industry professionals alike.
R is a free software package used for data analysis. Other packages, such as SAS and SPSS, exist, but they are commercial products sold by companies. R is managed by a core team of experts, and its scope is extremely broad. On the downside, R has a steep learning curve.
Overview
In the mid-1970s, the statistical programming language S was created at Bell Laboratories. A commercial product based on S, called S-Plus, was later released and enjoyed widespread success. Then Ross Ihaka and Robert Gentleman of the University of Auckland, New Zealand, wrote a simplified version of S to use as a teaching tool and named it R.
With R, it is always possible to do further work on the results of a statistical procedure. Because R is based on a formal computer language, it is very flexible in its uses. Analysis of even moderately complicated data is best approached through ad hoc statistical model building, and here R comes into its own. In addition, R does not require everything to be written in its own language; a single program can mix languages, using whichever is best suited to the desired functionality.
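The idea of building further analysis on the results of a statistical procedure can be illustrated with a minimal sketch. It assumes only base R and the built-in cars data set; the variable names (fit, fit2, rmse, cmp) are illustrative choices, not part of any standard workflow.

```r
# Fit a simple linear model: stopping distance as a function of speed
fit <- lm(dist ~ speed, data = cars)

# The fitted model is an ordinary object, so its parts can be reused
coefs <- coef(fit)        # named vector with the intercept and the slope
res   <- residuals(fit)   # one residual per observation

# Further computation built on the earlier result
rmse <- sqrt(mean(res^2))

# Ad hoc model building: add a quadratic term and compare the two models
fit2 <- update(fit, . ~ . + I(speed^2))
cmp  <- anova(fit, fit2)
print(cmp)
```

Because each step returns an object rather than only printed output, the result of one procedure can be passed directly into the next.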
In 1995, statistician Martin Mächler urged Ihaka and Gentleman to release the source code for R under the GNU General Public License, which allows for its free use and distribution. Around the same time, the rise of the Linux operating system spurred an upsurge in open-source software, and R was a good fit for those who wished to use Linux for statistical programming. A forum was also set up to discuss bugs in the system and the development of R.
By 1997, an international core team had been formed, and it later expanded. Its members collaborate via the Internet to control the development of R. As of 2024, the leadership included a president, a vice president, a secretary general, a treasurer, an R Foundation representative to the R Consortium, and four members at large. New members are evaluated on the basis of nonmonetary contributions to the R project, such as new code and other efforts.
Version 3.3.0 of R was released on May 3, 2016, and version 4.0 followed on April 24, 2020. The language continued to receive updates throughout the early 2020s. In 2024, an exploit in the R programming language was discovered. This flaw left some versions of R open to attacks from malicious actors.
R is viewed by many as a statistics system, though it is better understood as an environment within which statistical procedures can be implemented. R’s uses can easily be extended through packages, which are sets of code that add new tasks with set parameters. There are eight packages provided with the R distribution, while many more can be found through the Comprehensive R Archive Network (CRAN) family of Internet sites.
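The package mechanism described above can be sketched in a few lines. This is a minimal illustration that assumes a standard R installation; "somepkg" is a hypothetical package name, and the installation call is left commented out because it requires network access.

```r
# Packages shipped with the R distribution can be attached directly
library(stats)

# List the names of the packages installed in the local library
installed <- rownames(installed.packages())

# A contributed package is typically installed from CRAN once, then attached.
# "somepkg" is a hypothetical name used only for illustration.
# install.packages("somepkg")
# library(somepkg)

# Check whether a package is available without raising an error if it is not
has_mass <- requireNamespace("MASS", quietly = TRUE)
print(has_mass)
```

Checking with requireNamespace before attaching a package lets a script fall back gracefully when an optional package is missing.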
Bibliography
Bloomfield, Victor. Using R for Numerical Analysis in Science and Engineering. Boca Raton: CRC, 2014. Print.
Crawley, Michael J. Statistics: An Introduction Using R. Medford: Wiley, 2014. Print.
Dayal, Vikram. An Introduction to R for Quantitative Economics: Graphing, Simulating and Computing. New York: Springer, 2015. Print.
Hothorn, Torsten, and B. S. Everitt. A Handbook of Statistical Analyses Using R. 3rd ed. Boca Raton: CRC, 2014. Print.
Lakshmanan, Ravie. "New R Programming Language Vulnerability Exposes Projects to Supply Chain Attacks." The Hacker News, 29 Apr. 2024, thehackernews.com/2024/04/new-r-programming-vulnerability-exposes.html. Accessed 15 Nov. 2024.
Machlis, Sharon. "Beginner’s Guide to R." Computerworld. Computerworld, 6 June 2013. Web. 22 Aug. 2016.
Nash, John C. Nonlinear Parameter Optimization Using R Tools. Medford: Wiley, 2014. Print.
Stowell, Sarah. Using R for Statistics. New York: Apress, 2014. Print.
Sun, Changyou. Empirical Research in Economics: Growing up with R. Starkville: Pine Square, 2015. Print.