Technical Education Community
www.teccses.org

Natural Language Processing with Java (Reprint Edition)


Author: Richard M. Reese

Pages: 237

Publisher: Southeast University Press

Publication date: 2016

ISBN: 9787564160883

E-book formats: PDF/EPUB/TXT

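Synopsis

Natural Language Processing with Java shows how to organize text automatically using techniques such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization. It introduces NLP concepts in a way that readers with no background in statistics or natural language processing can follow.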

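Book Features

Natural language processing (NLP) is an important area of application development, and its relevance to contemporary problems will only grow. Demand for natural-language-accessible applications supported by NLP tasks has increased significantly. Natural Language Processing with Java (Reprint Edition, English), by Richard M. Reese, shows how to organize text automatically using techniques such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization. It introduces NLP concepts in a way that readers with no background in statistics or natural language processing can follow.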

Table of Contents

Preface

Chapter 1: Introduction to NLP
  What is NLP?
  Why use NLP?
  Why is NLP so hard?
  Survey of NLP tools
  Apache OpenNLP
  Stanford NLP
  LingPipe
  GATE
  UIMA
  Overview of text processing tasks
  Finding parts of text
  Finding sentences
  Finding people and things
  Detecting parts of speech
  Classifying text and documents
  Extracting relationships
  Using combined approaches
  Understanding NLP models
  Identifying the task
  Selecting a model
  Building and training the model
  Verifying the model
  Using the model
  Preparing data
  Summary

Chapter 2: Finding parts of text
  Understanding the parts of text
  What is tokenization?
  Uses of tokenizers
  Simple Java tokenizers
  Using the Scanner class
  Specifying the delimiter
  Using the split method
  Using the BreakIterator class
  Using the StreamTokenizer class
  Using the StringTokenizer class
  Performance considerations with Java core tokenization
  NLP tokenizer APIs
  Using the OpenNLPTokenizer class
  Using the SimpleTokenizer class
  Using the WhitespaceTokenizer class
  Using the TokenizerME class
  Using the Stanford tokenizer
  Using the PTBTokenizer class
  Using the DocumentPreprocessor class
  Using a pipeline
  Using LingPipe tokenizers
  Training a tokenizer to find parts of text
  Comparing tokenizers
  Understanding normalization
  Converting to lowercase
  Removing stopwords
  Creating a StopWords class
  Using LingPipe to remove stopwords
  Using stemming
  Using the Porter Stemmer
  Stemming with LingPipe
  Using lemmatization
  Using the StanfordLemmatizer class
  Using lemmatization in OpenNLP
  Normalizing using a pipeline
  Summary

Chapter 3: Finding sentences
  The SBD process
  What makes SBD difficult?
  Understanding SBD rules of LingPipe’s HeuristicSentenceModel class
  Simple Java SBDs
  Using regular expressions
  Using the BreakIterator class
  Using NLP APIs
  Using OpenNLP
  Using the SentenceDetectorME class
  Using the sentPosDetect method
  Using the Stanford API
  Using the PTBTokenizer class
  Using the DocumentPreprocessor class
  Using the StanfordCoreNLP class
  Using LingPipe
  Using the IndoEuropeanSentenceModel class
  Using the SentenceChunker class
  Using the MedlineSentenceModel class
  Training a sentence detector model
  Using the trained model
  Evaluating the model using the SentenceDetectorEvaluator class
  Summary

Chapter 4: Finding people and things
  Why NER is difficult?
  Techniques for name recognition
  Lists and regular expressions
  Statistical classifiers
  Using regular expressions for NER
  Using Java’s regular expressions to find entities
  Using LingPipe’s RegExChunker class
  Using NLP APIs
  Using OpenNLP for NER
  Determining the accuracy of the entity
  Using other entity types
  Processing multiple entity types
  Using the Stanford API for NER
  Using LingPipe for NER
  Using LingPipe’s name entity models
  Using the ExactDictionaryChunker class
  Training a model
  Evaluating a model
  Summary

Chapter 5: Detecting parts of speech
  The tagging process
  Importance of POS taggers
  What makes POS difficult?
  Using the NLP APIs
  Using OpenNLP POS taggers
  Using the OpenNLP POSTaggerME class for POS taggers
  Using OpenNLP chunking
  Using the POSDictionary class
  Using Stanford POS taggers
  Using Stanford MaxentTagger
  Using the MaxentTagger class to tag textese
  Using Stanford pipeline to perform tagging
  Using LingPipe POS taggers
  Using the HmmDecoder class with BestFirst tags
  Using the HmmDecoder class with NBest tags
  Determining tag confidence with the HmmDecoder class
  Training the OpenNLP POSModel
  Summary

Chapter 6: Classifying texts and documents
  How classification is used
  Understanding sentiment analysis
  Text classifying techniques
  Using APIs to classify text
  Using OpenNLP
  Training an OpenNLP classification model
  Using DocumentCategorizerME to classify text
  Using Stanford API
  Using the ColumnDataClassifier class for classification
  Using the Stanford pipeline to perform sentiment analysis
  Using LingPipe to classify text
  Training text using the Classified class
  Using other training categories
  Classifying text using LingPipe
  Sentiment analysis using LingPipe
  Language identification using LingPipe
  Summary

Chapter 7: Using parser to extract relationships
  Relationship types
  Understanding parse trees
  Using extracted relationships
  Extracting relationships
  Using NLP APIs
  Using OpenNLP
  Using the Stanford API
  Using the LexicalizedParser class
  Using the TreePrint class
  Finding word dependencies using the GrammaticalStructure class
  Finding coreference resolution entities
  Extracting relationships for a question-answer system
  Finding the word dependencies
  Determining the question type
  Searching for the answer
  Summary

Chapter 8: Combined approaches
  Preparing data
  Using Boilerpipe to extract text from HTML
  Using POI to extract text from Word documents
  Using PDFBox to extract text from PDF documents
  Pipelines
  Using the Stanford pipeline
  Using multiple cores with the Stanford pipeline
  Creating a pipeline to search text
  Summary

Index


Article Title: Natural Language Processing with Java (Reprint Edition)
Article link: https://www.teccses.org/593586.html