E-book
Author Perkins, Jacob, author

Title Python 3 text processing with NLTK 3 cookbook : over 80 practical recipes on natural language processing techniques using Python's NLTK 3.0 / Jacob Perkins ; cover image by Faiz Fattohi
Edition Second edition
Published Birmingham, England : Packt Publishing Ltd, 2014
©2014

Description 1 online resource (304 pages) : illustrations
Contents Cover; Copyright; Credits; About the Author; About the Reviewers; www.PacktPub.com; Table of Contents; Preface; Chapter 1: Tokenizing Text and WordNet Basics; Introduction; Tokenizing text into sentences; Tokenizing sentences into words; Tokenizing sentences using regular expressions; Training a sentence tokenizer; Filtering stopwords in a tokenized sentence; Looking up Synsets for a word in WordNet; Looking up lemmas and synonyms in WordNet; Calculating WordNet Synset similarity; Discovering word collocations; Chapter 2: Replacing and Correcting Words; Introduction; Stemming words
Lemmatizing words with WordNet; Replacing words matching regular expressions; Removing repeating characters; Spelling correction with Enchant; Replacing synonyms; Replacing negations with antonyms; Chapter 3: Creating Custom Corpora; Introduction; Setting up a custom corpus; Creating a wordlist corpus; Creating a part-of-speech tagged word corpus; Creating a chunked phrase corpus; Creating a categorized text corpus; Creating a categorized chunk corpus reader; Lazy corpus loading; Creating a custom corpus view; Creating a MongoDB-backed corpus reader; Corpus editing with file locking
Chapter 4: Part-of-speech Tagging; Introduction; Default tagging; Training a unigram part-of-speech tagger; Combining taggers with backoff tagging; Training and combining ngram taggers; Creating a model of likely word tags; Tagging with regular expressions; Affix tagging; Training a Brill tagger; Training the TnT tagger; Using WordNet for tagging; Tagging proper names; Classifier-based tagging; Training a tagger with NLTK-Trainer; Chapter 5: Extracting Chunks; Introduction; Chunking and chinking with regular expressions; Merging and splitting chunks with regular expressions
Expanding and removing chunks with regular expressions; Partial parsing with regular expressions; Training a tagger-based chunker; Classification-based chunking; Extracting named entities; Extracting proper noun chunks; Extracting location chunks; Training a named entity chunker; Training a chunker with NLTK-Trainer; Chapter 6: Transforming Chunks and Trees; Introduction; Filtering insignificant words from a sentence; Correcting verb forms; Swapping verb phrases; Swapping noun cardinals; Swapping infinitive phrases; Singularizing plural nouns; Chaining chunk transformations
Converting a chunk tree to text; Flattening a deep tree; Creating a shallow tree; Converting tree labels; Chapter 7: Text Classification; Introduction; Bag of words feature extraction; Training a Naive Bayes classifier; Training a decision tree classifier; Training a maximum entropy classifier; Training scikit-learn classifiers; Measuring precision and recall of a classifier; Calculating high information words; Combining classifiers with voting; Classifying with multiple binary classifiers; Training a classifier with NLTK-Trainer; Chapter 8: Distributed Processing and Handling Large Datasets
Summary This book is intended for Python programmers interested in learning how to do natural language processing. Maybe you've learned the limits of regular expressions the hard way, or you've realized that human language cannot be deterministically parsed like a computer language. Perhaps you have more text than you know what to do with, and need automated ways to analyze and structure that text. This Cookbook will show you how to train and use statistical language models to process text in ways that are practically impossible with standard programming tools. A basic knowledge of Python and the basi
Notes "Quick answers to common problems"--Cover
Includes index
English
Online resource; title from PDF title page (ebrary, viewed September 2, 2014)
Subject Python (Computer program language)
Natural language processing (Computer science) -- Research
COMPUTERS -- Programming Languages -- Python.
Form Electronic book
Author Fattohi, Faiz, cover designer
ISBN 9781782167860
1782167862
1782167854
9781782167853