
Author: Rajat Mehta
Pages: 13,392
Publisher: Southeast University Press
Publication date: 2019
ISBN: 9787564182878
E-book formats: pdf/epub/txt
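About the Book
This book begins with basic statistical analysis of big data using Java, then moves on to other data analytics topics such as classification, regression, clustering, and ensembles. It also covers advanced topics such as recommendation engines, large-scale graph analytics, real-time analytics, and deep learning. The book includes a range of case studies, such as sentiment analysis of a tweet dataset, recommendations on the MovieLens dataset, customer segmentation of an e-commerce dataset, and graph analysis of a real flight dataset. It is an end-to-end guide to implementing big data analytics with Java. Java is now the de facto language of mainstream big data environments, including Hadoop. This book will teach you how to analyze big data using production-friendly Java.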
About the Author
Rajat Mehta is a VP (technical architect) in technology at JP Morgan Chase in New York. He is a Sun certified Java developer and has worked on Java-related technologies for more than 16 years. For the past few years, his role has relied heavily on a big data stack and on running analytics on it. He is also a contributor to various open source projects that are available on his GitHub repository, and he writes frequently for developer magazines.
Table of Contents
Chapter 1: Big Data Analytics with Java
Why data analytics on big data?
Big data for analytics
Big data – a bigger pay package for Java developers
Basics of Hadoop – a Java sub-project
Distributed computing on Hadoop
HDFS concepts
Design and architecture of HDFS
Main components of HDFS
HDFS simple commands
Apache Spark
Concepts
Transformations
Actions
Spark Java API
Spark samples using Java 8
Loading data
Data operations – cleansing and munging
Analyzing data – count, projection, grouping, aggregation, and max/min
Actions on RDDs
Paired RDDs
Saving data
Collecting and printing results
Executing Spark programs on Hadoop
Apache Spark sub-projects
Spark machine learning modules
Mahout – a popular Java ML library
Deeplearning4j – a deep learning library
Summary
Chapter 2: First Steps in Data Analysis
Datasets
Data cleaning and munging
Basic analysis of data with Spark SQL
Building SparkConf and context
Dataframe and datasets
Load and parse data
Analyzing data – the Spark-SQL way
Spark SQL for data exploration and analytics
Market basket analysis – Apriori algorithm
Implementation of the Apriori algorithm in Apache Spark
Efficient market basket analysis using FP-Growth algorithm
Running FP-Growth on Apache Spark
Summary
Chapter 3: Data Visualization
Data visualization with Java JFreeChart
Using charts in big data analytics
Time Series chart
All India seasonal and annual average temperature series dataset
Simple single Time Series chart















