Some View From Hadley

Pragmatically, if you’re a data scientist, learning the basics of SQL is really important. You should also have a minimal reading knowledge of R and Python, because so many data science teams use both. Then I think you’re better off specializing in one of these two and getting really good at it, rather than spreading yourself too thin and being mediocre at several languages.


R vs python

Representing Data in R – Python equivalent

    import pandas as pd
    import numpy as np

    # 'characters' is equivalent to string
    firstName = 'jeff'
    print((type(firstName), firstName))
    <type 'str'> jeff

    # 'numeric' is equivalent to float
    heightCM = 188.2
    print((type(heightCM), heightCM))
    <type 'float'> 188.2

    # integer is equivalent to integer
    numberSons = 1
    print((type(numberSons), numberSons))
    <type 'int'> 1

    # 'logical' is equivalent to Boolean
    teachingCoursera = True
    print((type(teachingCoursera), teachingCoursera))
    <type 'bool'> True

    # 'vectors' is equivalent to numpy array or Python list (I will use array everywhere for consistency)
    heights = np.


Q: I have many separate tables that need to be combined into a single file. How do I do that?

Google search: “R read many datasets or tables”

Three steps:

1. Get a list of the file paths to read
2. Write a function to read one file
3. Loop over the files

step01: list all file paths

    library(here)
    allfiles = list.files(path = here("data"), # Use the ⭐here package to indicate the directory the files are in relative to the root directory
                          pattern = "AB.
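For readers on the Python side, the same three steps (list the files, write a reader, loop) could be sketched roughly as below. This is only an illustration using the standard library. The function names `list_files`, `read_table`, and `combine_tables`, the `*.csv` pattern, and the extra `source` column are assumptions made for the example and are not taken from the original post.

```python
import csv
from pathlib import Path


def list_files(data_dir, pattern="*.csv"):
    """Step 1: collect the matching file paths, sorted for a stable order."""
    return sorted(Path(data_dir).glob(pattern))


def read_table(path):
    """Step 2: read one file into a list of row dicts.

    Each row is tagged with a hypothetical 'source' key holding the file name,
    so the origin of a row is still known after combining.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        return [dict(row, source=Path(path).name) for row in csv.DictReader(handle)]


def combine_tables(data_dir, pattern="*.csv"):
    """Step 3: loop over the files and stack all rows into one list."""
    combined = []
    for path in list_files(data_dir, pattern):
        combined.extend(read_table(path))
    return combined
```

Calling `combine_tables("data")` would return one list of rows covering every CSV file in a `data` directory. Keeping the reader as its own function means the per-file logic can be checked on a single file before looping over all of them.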



Jixing Liu

Reading And Writing

Data Scientist

China