notebook

This is my personal notebook ^_^

R notes

These are my personal notes for learning R. Much of the content comes from online resources such as Coursera and DataCamp.

Basic R, R markdown, and git

Basic R

If you already have basic programming skills, cheat sheets are a quick way to pick up R syntax. Here’s what I found:

R markdown

R Markdown is a very useful tool for putting notes together. More references can be found in the RStudio R Markdown reference and the official cheat sheet.

GitHub also has a very useful Markdown tutorial.

git

Create a new repository on the command line:

echo "# notes" >> README.md
git init
git add README.md
git commit -m "first commit"
git remote add origin https://github.com/xzenggit/notes.git
git push -u origin master

Or push an existing repository from the command line:

git remote add origin https://github.com/xzenggit/notes.git
git push -u origin master
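
To try the commands above without touching GitHub, a local bare repository can stand in for the remote. This is only a sketch under my own assumptions: the temporary directory, the placeholder "Example" identity, and the local remote path are made up for illustration and are not part of the original commands.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch directory; nothing here talks to GitHub.
work=$(mktemp -d)

# A bare repository plays the role of the GitHub remote (assumption).
git init --quiet --bare "$work/remote.git"

# Same sequence as above, run in a fresh working directory.
mkdir "$work/notes"
cd "$work/notes"
echo "# notes" >> README.md
git init --quiet
git add README.md

# Identity is passed inline so the sketch does not depend on global git config.
git -c user.name="Example" -c user.email="example@example.com" \
    commit --quiet -m "first commit"

# Newer git versions may name the first branch "main"; force it to "master"
# so the push command matches the one in these notes.
git branch -M master

git remote add origin "$work/remote.git"
git push --quiet -u origin master

# List the branches that now exist on the stand-in remote.
git ls-remote --heads origin
```

The -u flag records origin/master as the upstream of the local master branch, so later git push and git pull calls in that directory work without extra arguments.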

GitHub’s guide for using git can be found here.

Guides covering GitHub Pages and other interesting topics can also be found here.

The dplyr package provides very powerful functions for data manipulation:

  • filter() and slice()

  • arrange()

  • select() and rename()

  • distinct()

  • mutate() and transmute()

  • summarise()

  • sample_n() and sample_frac()

Here is a good introduction to dplyr.

For data.table, DataCamp has a good tutorial and cheat sheet, both of which are pretty straightforward.

The basic goal of exploratory data analysis is to understand the data better. You can use whatever tools you like to get a preliminary picture of the data's structure, distributions, and so on.

Typically, people plot the distributions of the variables of interest, or cluster different variables, depending on the problem.

ggplot2 is a great graphics tool for exploring data. Please see:

Regression and statistical inference

Regression and statistical inference are more complicated topics. Here are some resources:

Machine learning and big data

Machine learning and big data are really the future of data science. There are tons of tutorials and courses on these topics. Here’s what I found: