How to Fit a Curve

October 13, 2019 in statistics

Fit a simple linear model

    states <- as.data.frame(state.x77[, c("Murder", "Population", "Illiteracy", "Income", "Frost")])
    fit <- lm(Murder ~ Population + Illiteracy + Income + Frost, data = states)
    # summary(fit)

Global validation of linear model assumptions

Use the gvlma() function from the gvlma package to check the assumptions of the linear model. Written by Pena and Slate (2006), gvlma() performs a global validation of linear model assumptions and also evaluates skewness, kurtosis, and heteroscedasticity. In other words, it provides a single omnibus (pass/fail) test of the model assumptions.

    # Listing 8.8 - Global test of linear model assumptions
    library(gvlma)
    gvmodel <- gvlma(fit)
    summary(gvmodel)
    ##
    ## Call:
    ## lm(formula = Murder ~ Population + Illiteracy + Income + Frost,
    ##     data = states)
    ##
    ## Residuals:
    ##    Min     1Q Median     3Q    Max
    ## -4.
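As a rough complement to the global test, the quantities it evaluates (skewness, kurtosis, heteroscedasticity) can be inspected by hand on the same model. The sketch below uses only base R and is not the gvlma procedure itself; the variable names (`z`, `het_fit`, `het_p`) and the simple regression of squared residuals on fitted values are assumptions made for illustration.

```r
# Hand-computed residual diagnostics for the states model.
# Illustration only: these are NOT the tests that gvlma() runs.
states <- as.data.frame(state.x77[, c("Murder", "Population",
                                      "Illiteracy", "Income", "Frost")])
fit <- lm(Murder ~ Population + Illiteracy + Income + Frost, data = states)

res <- residuals(fit)
# Standardize the residuals using the population (1/n) standard deviation
z <- (res - mean(res)) / sqrt(mean((res - mean(res))^2))

skewness <- mean(z^3)             # 0 for a symmetric distribution
excess_kurtosis <- mean(z^4) - 3  # 0 for a normal distribution

# Rough heteroscedasticity check (assumption): do squared residuals
# trend with the fitted values?
sq_res <- res^2
fitted_vals <- fitted(fit)
het_fit <- lm(sq_res ~ fitted_vals)
het_p <- summary(het_fit)$coefficients[2, 4]  # p-value of the slope

c(skewness = skewness, excess_kurtosis = excess_kurtosis, het_p_value = het_p)
```

Values of skewness and excess kurtosis near zero, together with a large slope p-value, would be consistent with the assumptions, but they do not replace the formal test.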

9 Ensembling The Predictions

April 1, 2018 in machine learning

Load Package And Data

    load("../../data/craet_8.Rdata")
    library(tidyverse)
    library(caret)
    # Set Parallel Processing - Decrease computation time
    if (!require("doMC")) install.packages("doMC")
    library(doMC)
    registerDoMC(cores = 4)

Train Multiple Models

So now we have predictions from multiple individual models. To do this we had to run the train() function once for each model, store the models and pass them to the res

    library(caretEnsemble)
    # Stacking Algorithms - Run multiple algos in one call.
    trainControl <- trainControl(method = "repeatedcv", number = 10, repeats = 3, savePredictions = TRUE, classProbs = TRUE)
    algorithmList <- c('rf', 'adaboost', 'earth', 'svmRadial')
    set.

8 How To Evaluate Performance Of Multiple Machine Learning Algorithms?

March 31, 2018 in machine learning

Load Package And Data

    load("../../data/craet_7.Rdata")
    library(tidyverse)
    library(caret)
    # Set Parallel Processing - Decrease computation time
    if (!require("doMC")) install.packages("doMC")
    library(doMC)
    registerDoMC(cores = 4)

Caret provides the resamples() function, to which you can pass multiple machine learning models and evaluate them collectively.

Define the training control

    fitControl <- trainControl(
      method = 'cv',                      # k-fold cross validation
      number = 5,                         # number of folds
      savePredictions = 'final',          # saves predictions for optimal tuning parameter
      classProbs = TRUE,                  # should class probabilities be returned
      summaryFunction = twoClassSummary   # results summary function
    )

Train models

    set.

7 How To Do Hyperparameter Tuning

March 30, 2018 in machine learning

Load Package And Data

    load("../../data/craet_6.Rdata")
    library(tidyverse)
    library(caret)
    # Set Parallel Processing - Decrease computation time
    if (!require("doMC")) install.packages("doMC")
    library(doMC)
    registerDoMC(cores = 4)

Hyperparameter tuning using tuneGrid

Model

Tuning Parameter Set

Cross Validation Set

The cross validation method can be one of:

    'boot': Bootstrap sampling
    'boot632': Bootstrap sampling with 63.2% bias correction applied
    'optimism_boot': The optimism bootstrap estimator
    'boot_all': All boot methods
    'cv': k-Fold cross validation
    'repeatedcv': Repeated k-Fold cross validation
    'oob': Out of Bag cross validation
    'LOOCV': Leave one out cross validation
    'LGOCV': Leave group out cross validation

Training And Tuning
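To make the 'cv' option in the list above concrete, here is a minimal base-R sketch of what k-fold cross validation does. It is hand-rolled for illustration and is not caret's internal implementation; the built-in `mtcars` data, the `mpg ~ wt + hp` model, and names such as `fold_id` and `rmse_per_fold` are assumptions chosen so the example runs without the post's data file.

```r
# Hand-rolled 5-fold cross validation (illustration only, not caret internals).
set.seed(100)
k <- 5
n <- nrow(mtcars)

# Randomly assign every row to one of k folds of near-equal size
fold_id <- sample(rep(seq_len(k), length.out = n))

rmse_per_fold <- sapply(seq_len(k), function(i) {
  train_rows <- mtcars[fold_id != i, ]   # k - 1 folds for fitting
  test_rows  <- mtcars[fold_id == i, ]   # held-out fold for scoring
  model <- lm(mpg ~ wt + hp, data = train_rows)
  pred <- predict(model, newdata = test_rows)
  sqrt(mean((test_rows$mpg - pred)^2))
})

# The cross-validated estimate is the average over the held-out folds
cv_rmse <- mean(rmse_per_fold)
cv_rmse
```

'repeatedcv' repeats this whole procedure with fresh random fold assignments and averages the results, while 'LOOCV' is the special case where each fold holds a single row.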

6 Training and Tuning the model

March 29, 2018 in machine learning

Load Package And Data

Training

1. How to train the model and interpret the results?

Once you have chosen an algorithm, building the model is fairly easy using the train() function.

Besides fitting the model, train() also does several other things:

    Cross validates the model
    Tunes the hyperparameters for optimal model performance
    Chooses the optimal model based on a given evaluation metric
    Preprocesses the predictors (what we did so far using preProcess())

2.

5 How to do feature selection using recursive feature elimination

March 28, 2018 in machine learning

You might need a rigorous way to determine the important variables before feeding them to the ML algorithm. A good choice for selecting the important features is recursive feature elimination (RFE).

RFE works in 3 broad steps:

Step 1: Build an ML model on a training dataset and estimate the feature importances on the test dataset. (With the degrees of freedom fixed, evaluate the importance of the variables on the test dataset.)

Step 2: Giving priority to the most important variables, iterate through by building models of given sizes.
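The elimination loop can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not caret's rfe(): it uses the built-in `mtcars` data, a plain lm() model, the absolute t-statistic on the training fit as a stand-in importance measure, and a single held-out split for scoring. All variable names are hypothetical.

```r
# Simplified recursive feature elimination (illustration only, not caret::rfe).
set.seed(100)
predictors <- setdiff(names(mtcars), "mpg")
train_idx <- sample(nrow(mtcars), 22)
train_set <- mtcars[train_idx, ]
test_set  <- mtcars[-train_idx, ]

remaining <- predictors
history <- list()
while (length(remaining) > 0) {
  # Fit a model on the current subset and score it on the held-out rows
  model <- lm(reformulate(remaining, response = "mpg"), data = train_set)
  pred <- predict(model, newdata = test_set)
  rmse <- sqrt(mean((test_set$mpg - pred)^2))
  history[[length(history) + 1]] <- list(vars = remaining, rmse = rmse)

  # Importance proxy (assumption): absolute t-statistic of each coefficient;
  # the intercept row is excluded before ranking
  coefs <- summary(model)$coefficients[-1, , drop = FALSE]
  weakest <- rownames(coefs)[which.min(abs(coefs[, "t value"]))]
  remaining <- setdiff(remaining, weakest)   # eliminate the weakest feature
}

# Pick the subset size with the lowest held-out error
rmse_by_size <- sapply(history, function(h) h$rmse)
best <- history[[which.min(rmse_by_size)]]
best$vars
```

A single train/test split gives a noisy ranking on a dataset this small; resampling the whole loop, as a cross-validated procedure would, is the more reliable design.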

4 How To Visualize The Importance Of Variables Using featurePlot

March 27, 2018 in machine learning

Load Package And Data

    load("../../data/craet_3-3.Rdata")
    library(tidyverse)
    library(caret)

Q: How The Predictors Influence The Y

Selecting important variables: look at how each variable is distributed within the groups of Y. Box plots and density plots are the usual tools.

Box plot

    featurePlot(x = trainData[, 1:18],
                y = trainData$Purchase,
                plot = "box",  # or "density"
                strip = strip.custom(par.strip.text = list(cex = .7)),
                scales = list(x = list(relation = "free"),
                              y = list(relation = "free")))

Density

    featurePlot(x = trainData[, 1:18],
                y = trainData$Purchase,
                plot = "density",
                strip = strip.custom(par.strip.text = list(cex = .7)),
                scales = list(x = list(relation = "free"),
                              y = list(relation = "free")))

    save.image("../../data/craet_4.Rdata")

Jixing Liu