Technical Education Community
www.teccses.org

Tidy Modeling with R (Reprint Edition; Chinese title: R语言简洁建模(影印版))

Author: Max Kuhn

Pages: 380

Publisher: Southeast University Press

Publication date: 2023

ISBN: 9787576605907

E-book formats: pdf/epub/txt

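About the Book

tidymodels is a collection of R packages for modeling and machine learning. Whether you are a beginner or have years of modeling experience, this practical guide shows data analysts, business analysts, and data scientists how the tidymodels framework offers a consistent, flexible approach to their work. RStudio engineers Max Kuhn and Julia Silge demonstrate ways to create models by focusing on an R dialect called the tidyverse. Software that adopts tidyverse principles shares a high-level design philosophy as well as low-level grammar and data structures, so learning one part of the ecosystem makes the next part easier to master. You will understand why the tidymodels framework is so widely used.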

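About the Author

Max Kuhn is Director of Nonclinical Statistics at Pfizer Global R&D in Groton, Connecticut. He has nearly 20 years of experience applying predictive models in the pharmaceutical and diagnostics industries, and he is the author of many R packages.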

Table of Contents

Preface

Part I. Introduction

1. Software for Modeling

Fundamentals for Modeling Software

Types of Models

Descriptive Models

Inferential Models

Predictive Models

Connections Between Types of Models

Some Terminology

How Does Modeling Fit into the Data Analysis Process?

Chapter Summary

2. A Tidyverse Primer

Tidyverse Principles

Design for Humans

Reuse Existing Data Structures

Design for the Pipe and Functional Programming

Examples of Tidyverse Syntax

Chapter Summary

3. A Review of R Modeling Fundamentals

An Example

What Does the R Formula Do?

Why Tidiness Is Important for Modeling

Combining Base R Models and the Tidyverse

The tidymodels Metapackage

Chapter Summary

Part II. Modeling Basics

4. The Ames Housing Data

Exploring Features of Homes in Ames

Chapter Summary

5. Spending Our Data

Common Methods for Splitting Data

What About a Validation Set?

Multilevel Data

Other Considerations for a Data Budget

Chapter Summary

6. Fitting Models with parsnip

Create a Model

Use the Model Results

Make Predictions

parsnip-Extension Packages

Creating Model Specifications

Chapter Summary

7. A Model Workflow

Where Does the Model Begin and End?

Workflow Basics

Adding Raw Variables to the workflow()

How Does a workflow() Use the Formula?

Tree-Based Models

Special Formulas and Inline Functions

Creating Multiple Workflows at Once

Evaluating the Test Set

Chapter Summary

8. Feature Engineering with Recipes

A Simple recipe() for the Ames Housing Data

Using Recipes

How Data Are Used by the recipe()

Examples of Steps

Encoding Qualitative Data in a Numeric Format

Interaction Terms

Spline Functions

Feature Extraction

Row Sampling Steps

General Transformations

Natural Language Processing

Skipping Steps for New Data

Tidy a recipe()

Column Roles

Chapter Summary

9. Judging Model Effectiveness

Performance Metrics and Inference

Regression Metrics

Binary Classification Metrics

Multiclass Classification Metrics

Chapter Summary

Part III. Tools for Creating Effective Models

10. Resampling for Evaluating Performance

The Resubstitution Approach

Resampling Methods

Cross-Validation

Repeated Cross-Validation

Leave-One-Out Cross-Validation

Monte Carlo Cross-Validation

Validation Sets

Bootstrapping

Rolling Forecasting Origin Resampling

Estimating Performance

Parallel Processing

Saving the Resampled Objects

Chapter Summary

11. Comparing Models with Resampling

Creating Multiple Models with Workflow Sets

Comparing Resampled Performance Statistics

Simple Hypothesis Testing Methods

Bayesian Methods

A Random Intercept Model

The Effect of the Amount of Resampling

Chapter Summary

12. Model Tuning and the Dangers of Overfitting

Model Parameters

Tuning Parameters for Different Types of Models

What Do We Optimize?

The Consequences of Poor Parameter Estimates

Two General Strategies for Optimization

Tuning Parameters in tidymodels

Chapter Summary

13. Grid Search

Regular and Nonregular Grids

Regular Grids

Nonregular Grids

Evaluating the Grid

Finalizing the Model

Tools for Creating Tuning Specifications

Tools for Efficient Grid Search

Submodel Optimization

Parallel Processing

Benchmarking Boosted Trees

Access to Global Variables

Racing Methods

Chapter Summary

14. Iterative Search

A Support Vector Machine Model

Bayesian Optimization

A Gaussian Process Model

Acquisition Functions

The tune_bayes() Function

Simulated Annealing

Simulated An

Article Title: Tidy Modeling with R (Reprint Edition; Chinese title: R语言简洁建模(影印版))
Article link:https://www.teccses.org/1462593.html