Technical Education Community
www.teccses.org

R Language Machine Learning Reference Manual (Reprint Edition)

Author: 丘祐玮

Pages: 430

Publisher: Southeast University Press (东南大学出版社)

Publication date: 2016

ISBN: 9787564160630

E-book formats: pdf/epub/txt

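Synopsis

This book introduces the basics of R by setting up a user-friendly programming environment and performing data ETL with R. It provides data exploration examples that show how effective R's data visualization and machine learning capabilities are at uncovering hidden relationships. You will gain an in-depth understanding of important machine learning topics, including data classification, regression, clustering, association rule mining, dimension reduction, and more.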

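Highlights

R is a powerful open-source functional programming language. At its core, R is a statistical programming language that provides a rich set of tools for analyzing data and creating advanced graphics.

"R Language Machine Learning Reference Manual (Reprint Edition) (English Edition)", written by 丘祐玮, introduces the basics of R by setting up a user-friendly programming environment and performing data ETL with R. It provides data exploration examples that show how effective R's data visualization and machine learning capabilities are at uncovering hidden relationships. You will gain an in-depth understanding of important machine learning topics, including data classification, regression, clustering, association rule mining, dimension reduction, and more.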

Table of Contents

Preface

Chapter 1: Practical Machine Learning with R
  Introduction
  Downloading and installing R
  Downloading and installing RStudio
  Installing and loading packages
  Reading and writing data
  Using R to manipulate data
  Applying basic statistics
  Visualizing data
  Getting a dataset for machine learning

Chapter 2: Data Exploration with RMS Titanic
  Introduction
  Reading a Titanic dataset from a CSV file
  Converting types on character variables
  Detecting missing values
  Imputing missing values
  Exploring and visualizing data
  Predicting passenger survival with a decision tree
  Validating the power of prediction with a confusion matrix
  Assessing performance with the ROC curve

Chapter 3: R and Statistics
  Introduction
  Understanding data sampling in R
  Operating a probability distribution in R
  Working with univariate descriptive statistics in R
  Performing correlations and multivariate analysis
  Operating linear regression and multivariate analysis
  Conducting an exact binomial test
  Performing Student's t-test
  Performing the Kolmogorov-Smirnov test
  Understanding the Wilcoxon rank sum and signed rank test
  Working with Pearson's chi-squared test
  Conducting a one-way ANOVA
  Performing a two-way ANOVA

Chapter 4: Understanding Regression Analysis
  Introduction
  Fitting a linear regression model with lm
  Summarizing linear model fits
  Using linear regression to predict unknown values
  Generating a diagnostic plot of a fitted model
  Fitting a polynomial regression model with lm
  Fitting a robust linear regression model with rlm
  Studying a case of linear regression on SLID data
  Applying the Gaussian model for generalized linear regression
  Applying the Poisson model for generalized linear regression
  Applying the binomial model for generalized linear regression
  Fitting a generalized additive model to data
  Visualizing a generalized additive model
  Diagnosing a generalized additive model

Chapter 5: Classification (I) – Tree, Lazy, and Probabilistic
  Introduction
  Preparing the training and testing datasets
  Building a classification model with recursive partitioning trees
  Visualizing a recursive partitioning tree
  Measuring the prediction performance of a recursive partitioning tree
  Pruning a recursive partitioning tree
  Building a classification model with a conditional inference tree
  Visualizing a conditional inference tree
  Measuring the prediction performance of a conditional inference tree
  Classifying data with the k-nearest neighbor classifier
  Classifying data with logistic regression
  Classifying data with the naive Bayes classifier

Chapter 6: Classification (II) – Neural Network and SVM
  Introduction
  Classifying data with a support vector machine
  Choosing the cost of a support vector machine
  Visualizing an SVM fit
  Predicting labels based on a model trained by a support vector machine
  Tuning a support vector machine
  Training a neural network with neuralnet
  Visualizing a neural network trained by neuralnet
  Predicting labels based on a model trained by neuralnet
  Training a neural network with nnet
  Predicting labels based on a model trained by nnet

Chapter 7: Model Evaluation
  Introduction
  Estimating model performance with k-fold cross-validation
  Performing cross-validation with the e1071 package
  Performing cross-validation with the caret package
  Ranking the variable importance with the caret package
  Ranking the variable importance with the rminer package
  Finding highly correlated features with the caret package
  Selecting features using the caret package
  Measuring the performance of the regression model
  Measuring prediction performance with a confusion matrix
  Measuring prediction performance using ROCR
  Comparing an ROC curve using the caret package
  Measuring performance differences between models with the caret package

Chapter 8: Ensemble Learning
  Introduction
  Classifying data with the bagging method
  Performing cross-validation with the bagging method
  Classifying data with the boosting method
  Performing cross-validation with the boosting method
  Classifying data with gradient boosting
  Calculating the margins of a classifier
  Calculating the error evolution of the ensemble method
  Classifying data with random forest
  Estimating the prediction errors of different classifiers

Chapter 9: Clustering
  Introduction
  Clustering data with hierarchical clustering
  Cutting trees into clusters
  Clustering data with the k-means method
  Drawing a bivariate cluster plot
  Comparing clustering methods
  Extracting silhouette information from clustering
  Obtaining the optimum number of clusters for k-means
  Clustering data with the density-based method
  Clustering data with the model-based method
  Visualizing a dissimilarity matrix
  Validating clusters externally

Chapter 10: Association Analysis and Sequence Mining
  Introduction
  Transforming data into transactions
  Displaying transactions and associations
  Mining associations with the Apriori rule
  Pruning redundant rules
  Visualizing association rules
  Mining frequent itemsets with Eclat
  Creating transactions with temporal information
  Mining frequent sequential patterns with cSPADE

Chapter 11: Dimension Reduction
  Introduction
  Performing feature selection with FSelector
  Performing dimension reduction with PCA
  Determining the number of principal components using the scree test
  Determining the number of principal components using the Kaiser method
  Visualizing multivariate data using biplot
  Performing dimension reduction with MDS
  Reducing dimensions with SVD
  Compressing images with SVD
  Performing nonlinear dimension reduction with ISOMAP
  Performing nonlinear dimension reduction with local linear embedding

Chapter 12: Big Data Analysis (R and Hadoop)
  Introduction
  Preparing the RHadoop environment
  Installing rmr2
  Installing rhdfs
  Operating HDFS with rhdfs
  Implementing a word count problem with RHadoop
  Comparing the performance between an R MapReduce program and a standard R program
  Testing and debugging the rmr2 program
  Installing plyrmr
  Manipulating data with plyrmr
  Conducting machine learning with RHadoop
  Configuring RHadoop clusters on Amazon EMR

Appendix A: Resources for R and Machine Learning
Appendix B: Dataset – Survival of Passengers on the Titanic
Index
