What do you do when you want to use results from the literature to anchor your own analysis? We'll go through a practical scenario: scraping an HTML table from a Nature Genetics article into R and wrangling the data into a useful format.

01. Scraping an HTML table from a webpage

    # load packages
    library("rvest")
    library("knitr")
    library(tidyverse)

    # scraping web page
    url <- "https://www.nature.com/articles/ng.2802/tables/2"

    #====🔥 find where the table lives on this webpage====
    table_path <- '//*[@id="content"]/div/div/figure/div[1]/div/div[1]/table'

    # get the table
    nature_genetics_table2 <- url %>%
      read_html() %>%
      html_nodes(xpath = table_path) %>%
      html_table(fill = TRUE) %>% .
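To try the same XPath-then-`html_table()` pattern without hitting the network, here is a minimal sketch that runs against a small inline HTML string. The HTML snippet, its column names (`id`, `group`, `value`), its values, and the variable names `toy_html`, `toy_path`, and `toy_table` are all made up for illustration; they are not taken from the Nature Genetics page. One thing worth knowing: `html_nodes()` returns a node set, so `html_table()` gives back a list with one data frame per matched table, and you pick out the one you want by position.

```r
library(rvest)

# Hypothetical stand-in for a web page: a tiny HTML document with one table.
# All names and values here are invented for this sketch.
toy_html <- paste0(
  '<html><body><div id="content"><table>',
  '<tr><th>id</th><th>group</th><th>value</th></tr>',
  '<tr><td>a1</td><td>x</td><td>1.5</td></tr>',
  '<tr><td>b2</td><td>y</td><td>2.5</td></tr>',
  '</table></div></body></html>'
)

# XPath to the table inside the element with id "content"
toy_path <- '//*[@id="content"]/table'

# Parse the HTML, select the table node(s), and convert them to data frames
toy_tables <- read_html(toy_html) %>%
  html_nodes(xpath = toy_path) %>%
  html_table()

# html_table() on a node set returns a list; take the first (and only) table
toy_table <- toy_tables[[1]]

print(toy_table)
```

In a browser, you can usually get a starting XPath by inspecting the table element in the developer tools and copying its XPath. Paths copied this way tend to be brittle, so it is worth checking that the node set is not empty before converting it.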


Jixing Liu

Reading And Writing

Data Scientist

China