I have written a couple of blog posts on R packages (here | here), and this post is a sort of preset of the most needed packages for data science, statistics, and everyday work with R.

There are thousands of R packages available on CRAN (with all its mirror sites), on GitHub, and in individual developers' repositories. Many useful functions appear in several different packages, often with overlapping functionality, so the choice of a particular package comes down to user preference and the work at hand. From the perspective of a statistician and data scientist, I will cover the essential and major packages in sections. This is by no means a definitive list, only a personal preference.

Loading and reading data into the R environment is most likely one of the first steps, if not the most important. Data is the fuel.

This breaks down into further sections: reading data from binary files, from ODBC drivers, and from SQL databases.

1.1. Importing from binary files

    # Reading from SAS and SPSS
    install.packages("Hmisc", dependencies = TRUE)
    install.packages("foreign", dependencies = TRUE)
    install.packages(c("protr","foreign"), dependencies = TRUE)
    install.packages(c("readxl","xlsx"), dependencies = TRUE)
    install.packages(c("csv","readr","tidyverse"), dependencies = TRUE)
    install.packages(c("jsonlite","rjson","RJSONIO","jsonvalidate"), dependencies = TRUE)
    install.packages("sparkavro", dependencies = TRUE)
    install.packages("arrow", dependencies = TRUE)
    devtools::install_github("apache/arrow/r")
    install.packages("XML", dependencies = TRUE)

This will cover most of the work done through ODBC drivers:

    install.packages(c("odbc", "RODBC"), dependencies = TRUE)

Accessing a SQL database with a dedicated package can also have great benefits when pulling data from the database into an R data frame. In addition, I have added some useful R packages that make querying data in R much easier (RSQL) or even let you write SQL statements directly (sqldf), along with other great features.

    install.packages(c("mssqlR", "RODBC"), dependencies = TRUE)
    install.packages(c("RMySQL","dbConnect"), dependencies = TRUE)
    install.packages(c("postGIStools","RPostgreSQL"), dependencies = TRUE)
    install.packages(c("odbc"), dependencies = TRUE)
    install.packages(c("RRedshiftSQL"), dependencies = TRUE)
    install.packages(c("RSQLite","sqliter","dbflobr"), dependencies = TRUE)
    install.packages(c("RSQL","sqldf","poplite","queryparser"), dependencies = TRUE)

Data engineering, data copying, data wrangling, and data manipulation are the very next task in the journey.

Cleaning data

Data cleaning is essential for removing outliers, NULL and N/A values, and wrong values, for imputing or replacing them, for checking frequencies and descriptive statistics, and for applying univariate, bivariate, and multivariate statistical analysis to tackle these issues. This is by no means a complete list, but it can be a good starting point:

    install.packages(c("janitor","outliers","missForest","frequency","Amelia", …

Working with correct data types and knowing your way around formatting your data set is easily overlooked, and yet important. A list of the must-have packages:

    install.packages(c("stringr","lubridate","glue",
                       "scales","hablar","readr"), dependencies = TRUE)

2.3. Wrangling, subsetting and aggregating data

There are many packages available for wrangling, engineering, and aggregating data. Base R in particular should not be overlooked, since it offers a lot of great and powerful features. The following are the packages most widely used in the R community that make it easy to maneuver data:

    install.packages(c("dplyr","tidyverse","purrr","magrittr", …

3.

Many of the statistical tests (Shapiro, t-test, Wilcoxon, equality, …) are available in the base and stats packages that ship with the R engine. That is great, because R is primarily a statistical language and many of the tests are already included. These are additional packages that I have used:

    install.packages(c("stats","ggpubr","lme4","MASS","car"),
                     dependencies = TRUE)

Data sampling, working with samples and populations, inference, weights, and the various types of statistical sampling can be found in these brilliant packages, including some that are great for survey data.

    install.packages(c("sampling","icarus","sampler","SamplingStrata",
                       "survey","laeken","stratification","simPop"),
                     dependencies = TRUE)

4.

Regardless of the type of variable, the type of analysis, and the results a statistician wants to get, there is a list of packages that should be part of the daily R environment when it comes to statistical analysis.

Regression Analysis

Frankly, one of the most important analyses.

    install.packages(c("stats","lars","caret","survival","gam","glmnet", …
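The install.packages() calls in this post reinstall every listed package each time they run. As a minimal sketch, assuming you would rather install only what is missing, the two helpers below (missing_packages and install_if_missing are hypothetical names, not part of the post or of any package) check the local library first using base R's installed.packages().

```r
# Hypothetical helper: return the subset of `pkgs` that is not installed
# in any of the current library paths.
missing_packages <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

# Hypothetical helper: install only the packages that are missing.
# Extra arguments (e.g. repos) are passed on to install.packages().
install_if_missing <- function(pkgs, ...) {
  todo <- missing_packages(pkgs)
  if (length(todo) > 0) {
    install.packages(todo, dependencies = TRUE, ...)
  }
  invisible(todo)  # names of the packages that needed installing
}

# "stats" and "utils" ship with R, so this call installs nothing.
install_if_missing(c("stats", "utils"))
```

Any of the package vectors listed in the post could be passed to install_if_missing() in the same way. Note that the check is by name only, so it does not update packages that are installed but outdated.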