IDE
Integrated development environment (e.g., RStudio)
A software application that brings together all the tools you may want to use in a single place.
R
R is a programming language used for statistical analysis, visualization, and other data analysis.
R is a programming language that can be used to perform tasks in every phase of the data analysis process. R can help you structure, organize, and clean your data, create detailed visuals, and make dynamic documents.
R is typically used by professionals who have a statistical or research-oriented approach to solving problems; among them are scientists, statisticians, and engineers.
A few advantages of R include its:
-Popularity: R is frequently used for data analysis
-Tools: R has a convenient library of ready-to-use tools for data cleaning and analysis
-Focus: R was created with statistics in mind; data analysts can conveniently use a rich library of statistical routines
-Adaptability: R adapts well for use in both machine learning and data analysis projects
-Availability: R is an open source programming language
What are the following languages used for?
-R
-Python
-HTML5
-CSS
-Swift
-Java
-C#
-Ruby
-PHP
-C++
-R: Offers statistical features for data analysis and is useful for creating advanced data visuals.
-Python: General purpose language that can be used to create what you need for data analysis.
-HTML5: Used by web designers to create structure for web pages and is used to connect to hosting platforms.
-CSS: Used for web page design and to control graphic elements (e.g., color, layout, and font) and page presentation on multiple devices (e.g., large screens, mobile screens, and printers).
-Swift: Apple's programming language, used by mobile application developers to build fast apps for iOS and other Apple platforms.
-Java: Official language for Android development (e.g., Android apps). Also used by web application developers to create enterprise web applications that can run on multiple clients.
-C# (C sharp): Object-oriented language used to create mobile apps in the .NET open source developer platform. It is also used by game developers to create games.
-Ruby: General-purpose, object-oriented programming language used for web application development.
-PHP: Scripting language suited for web application development.
-C++: Extension of the C programming language that is used to create console games like those for Xbox.
Statistical Analysis
The science of collecting, exploring, and presenting large amounts of data to discover underlying patterns and trends.
R Console/Console Pane
Program window in R where you make use of the R programming language. It is an interface that lets you view, write, edit, and execute your R code.
The R console is a simple environment in which you can write single lines of R code. It won't save your code beyond a single session, but it is valuable for running simple functions.
(RStudio is an IDE (integrated development environment) that builds on the simplicity of the R console.)
RGui
R's own graphical user interface (the basic GUI that ships with R on Windows). It is separate from RStudio.
How to Save in RStudio
If you want to save the code you execute, save it in a text file (such as an R script) or an .Rmd file (which you will learn more about in upcoming lessons).
Note: Keep in mind that everything you write in the R Console disappears after you end your session (or close the console).
How to see instructions for how to cite R in a publication
Type citation() after the prompt and press Enter (Windows) or Return (Mac)
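A minimal sketch of what this looks like in practice. citation() returns a citation object that can be printed or converted to a BibTeX entry with toBibtex(). Both functions come from the utils package that ships with R. The package name "stats" is only an illustrative choice.

```r
# Get the recommended citation for R itself and print it
r_cite <- citation()
print(r_cite)

# The same function accepts an installed package name ("stats" is just an example)
pkg_cite <- citation("stats")

# Convert the citation to a BibTeX entry for a reference manager
toBibtex(r_cite)
```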
Packages
Packages are units of reproducible R code.
Members of the R community create packages to keep track of the R functions that they write and reuse.
Packages offer a helpful combination of code, reusable R functions, descriptive documentation, tests for checking your code, and sample data sets.
tidyverse
A collection of packages in R with a common design philosophy for data manipulation, exploration, and visualization.
For a lot of data analysts, the tidyverse is an essential tool.
Hot key/Key Bindings Cheat Sheet
Help>Keyboard Shortcuts Help
Open-source
Code that is freely available and may be modified and shared by the people who use it
To start a new file
File>New File>R Script
OR Ctrl+Shift+N on Windows and Command+Shift+N on Mac
Base R
What you get after you install R. (The extra functionality comes from add-ons available from developers.)
How to install a package
How to install multiple packages at once
To install a package, you use the code install.packages("package_name", dependencies = TRUE)
OR
install.packages("package_name")
*Note: You can add the option dependencies = TRUE, which tells R to install the other things that are necessary for the package or packages to run smoothly. Otherwise, you may need to install additional packages to unlock the full functionality of a package.
*Make sure to enter this code in the console in the lower left-hand pane.
Or
Tools (on the top bar)>Install Packages>enter the package name (it will auto-complete the name if you don’t know the precise spelling)
To install multiple packages at once:
install.packages(c("name1", "name2"))
install.packages(c("tidyverse", "dslabs"))
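The install commands above can be combined with a check for what is already installed. This is a sketch, not a required workflow: missing_packages() is a hypothetical helper written for this example, and the interactive() guard is an assumption added so the download only runs when you are at the console.

```r
# Hypothetical helper: return the names in `pkgs` that are not yet installed
missing_packages <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

wanted <- c("tidyverse", "dslabs")
to_install <- missing_packages(wanted)

# Install only what is missing; guarded so nothing downloads in a non-interactive run
if (interactive() && length(to_install) > 0) {
  install.packages(to_install, dependencies = TRUE)
}
```

Skipping packages that are already present avoids re-downloading them every time the code is run.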
How to load a package
To load a package, you use the code library(package_name).
Make sure to enter this code in the console in the lower left-hand pane.
How to use a dataset from a package you’ve loaded
&
How to see that dataset
If you also want to use a dataset from a package you have loaded, then you use the code data(dataset_name).
To see the dataset, you can take the additional step of View(dataset_name).
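A short sketch of these steps, using the datasets package and its mtcars dataset because both ship with R (they are an illustrative choice, not part of the original card). head() and str() are shown as console alternatives to View(), which opens a viewer window in RStudio.

```r
# The datasets package ships with R; library() attaches it for this session
library(datasets)

# Load the mtcars dataset into the global environment
data(mtcars)

# Inspect it in the console; View(mtcars) would open a spreadsheet-style viewer instead
head(mtcars)
str(mtcars)
```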
What command do you use to see all of the packages you’ve installed?
installed.packages()
After a package has been installed, what command do we use to load the package every time we want to use it?
library(pkg_name)
If you try to load a package with library(blahblah) and get a message like Error in library(blahblah) : there is no package called ‘blahblah’, it means_________
you need to install that package first with install.packages().
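To see this error without stopping a script, the call can be wrapped in tryCatch(). This sketch assumes no package named blahblah is installed on your machine. requireNamespace() is shown as a way to check availability without raising an error.

```r
# Try to load a package that is not installed and capture the error message
result <- tryCatch(
  {
    library(blahblah)
    "loaded"
  },
  error = function(e) conditionMessage(e)
)
print(result)

# requireNamespace() returns TRUE or FALSE instead of raising an error
is_available <- requireNamespace("blahblah", quietly = TRUE)
print(is_available)
```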
How to tell if a dataset is currently loaded in your R environment
ls()
This function lists all objects currently in your global environment.
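A minimal sketch of checking for a dataset this way. The object name my_numbers and the mtcars dataset are illustrative choices. At the console, ls() with no arguments lists the global environment. The last line names that environment explicitly so the check behaves the same wherever the code is run.

```r
# Create an object and load a dataset, then list what is in the environment
my_numbers <- c(1, 2, 3)
data(mtcars)
print(ls())

# Check for one specific name in the global environment
is_loaded <- "mtcars" %in% ls(envir = globalenv())
print(is_loaded)
```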
Hot Keys to Save a Script
Ctrl+S on Windows and Command+S on Mac
Hot keys to run an entire script
Ctrl+Shift+Enter on Windows and Command+Shift+Return on Mac
Hot keys to run a single line of script
Ctrl+Enter on Windows and Command+Return on Mac