Natural Language Processing

Uploaded by

Vicky Nagar

Understanding and replying to human language

1. Natural Language Processing

Overview

NLP is a part of computer science and artificial intelligence that deals with human language.

Branches

Natural Language Understanding
Natural Language Understanding (NLU) is a branch of artificial intelligence (AI) that uses computer software to understand input made in the form of sentences in text or speech format.

Natural Language Generation
Natural Language Generation (NLG) is a software process that transforms structured data into natural language. It can be used to produce long-form content for organizations to automate custom reports, as well as custom content for a web or mobile application.

2. NLP Terminology

1 Tokenization
Breaking strings into small individual tokens or words.

Example
"What's the time now?" to "What's", "the", "time", "now"
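Tokenization can be sketched with a single regular expression from the Python standard library. The pattern below is an assumption made for illustration (it keeps contractions such as "What's" together and drops punctuation); library tokenizers such as NLTK's `word_tokenize` follow their own rules and may split contractions differently.

```python
import re

# Assumed pattern for this sketch: a token is a run of letters or digits,
# optionally continued by an apostrophe plus more letters ("What's").
# Spaces and punctuation act as separators and are dropped.
TOKEN_PATTERN = re.compile(r"[A-Za-z0-9]+(?:['’][A-Za-z0-9]+)*")


def tokenize(text: str) -> list[str]:
    """Split a string into word tokens, dropping punctuation."""
    return TOKEN_PATTERN.findall(text)


tokens = tokenize("What's the time now?")
print(tokens)  # ["What's", 'the', 'time', 'now']
```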

2 Stemming
Normalizing words into their base or root form.

Example
playing, played, plays to play
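A minimal sketch of the idea, assuming a tiny hand-picked suffix list: strip a known ending and keep what is left. This is not the algorithm any particular library uses; production stemmers such as NLTK's `PorterStemmer` apply many more rules.

```python
# Assumed suffix list for this sketch; real stemmers use many more rules.
SUFFIXES = ("ing", "ed", "s")


def naive_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


print([naive_stem(w) for w in ["Playing", "played", "plays"]])
# ['play', 'play', 'play']
```

With these rules `naive_stem("studies")` returns `studie`, which illustrates that a stem is not guaranteed to be a valid word.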

3 Lemmatization
Grouping the different inflected forms of a word under its dictionary form, called the lemma. Similar to stemming, but it always returns a valid word.

Example
better, best to good
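Because a lemma is a dictionary form, lemmatization is at heart a lookup rather than suffix stripping. The sketch below uses a hypothetical hand-written table to make that visible; a real lemmatizer such as NLTK's `WordNetLemmatizer` consults a full lexicon (and needs the WordNet data to be downloaded first).

```python
# Hypothetical mini-dictionary for this sketch: (word, part of speech) -> lemma.
# A real lemmatizer consults a full lexicon instead of a hand-written table.
LEMMA_TABLE = {
    ("better", "adj"): "good",
    ("best", "adj"): "good",
    ("played", "verb"): "play",
    ("playing", "verb"): "play",
    ("plays", "verb"): "play",
    ("studies", "verb"): "study",
}


def lemmatize(word: str, pos: str) -> str:
    """Return the dictionary form of a word, or the lowercased word if unknown."""
    key = (word.lower(), pos)
    return LEMMA_TABLE.get(key, word.lower())


print(lemmatize("Better", "adj"))    # good
print(lemmatize("studies", "verb"))  # study
```

The part of speech is part of the lookup key because the same surface form can have different lemmas depending on how it is used.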

4 POS - parts of speech
The most common POS tagging task is identifying words as nouns, verbs, adjectives, etc.

Example
"Google about pantech solutions": "Google" may be a noun or a verb
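The noun-or-verb ambiguity of "Google" can be illustrated with a toy tagger. Everything here is an assumption made for the sketch: the lexicon, the tag names, and the single context rule. Trained taggers such as NLTK's `pos_tag` learn such decisions from annotated text instead.

```python
# Hypothetical lexicon for this sketch: each word lists its possible tags.
LEXICON = {
    "google": ["NOUN", "VERB"],  # ambiguous word
    "about": ["PREP"],
    "pantech": ["NOUN"],
    "solutions": ["NOUN"],
    "is": ["VERB"],
    "a": ["DET"],
    "company": ["NOUN"],
}


def tag(tokens: list[str]) -> list[tuple[str, str]]:
    """Assign one tag per token using the lexicon and one context rule."""
    tagged = []
    for i, token in enumerate(tokens):
        options = LEXICON.get(token.lower(), ["NOUN"])  # unknown words default to NOUN
        choice = options[0]
        if "VERB" in options and len(options) > 1:
            # Assumed rule: an ambiguous word directly followed by a
            # preposition is read as a verb ("Google about ...").
            following = tokens[i + 1].lower() if i + 1 < len(tokens) else ""
            if "PREP" in LEXICON.get(following, []):
                choice = "VERB"
        tagged.append((token, choice))
    return tagged


print(tag("Google about pantech solutions".split()))
print(tag("Google is a company".split()))
```

In the first sentence "Google" is tagged VERB because "about" follows it; in the second it falls back to NOUN.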

5 Named entity recognition
Recognizing words as a movie, monetary value, organization, location, quantity, or person.

Example
"Google about pantech solutions": "Google" is a verb; "Pantech solutions" is an organization
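One simple way to picture entity recognition is a gazetteer lookup: scan the tokens and label any phrase that appears in a list of known names. The gazetteer entries and label names below are hypothetical; statistical recognizers such as NLTK's `ne_chunk` rely on trained models rather than a fixed list.

```python
# Hypothetical gazetteer for this sketch: lowercase phrase -> entity label.
GAZETTEER = {
    ("pantech", "solutions"): "ORGANIZATION",
    ("london",): "LOCATION",
}
MAX_PHRASE_LEN = max(len(key) for key in GAZETTEER)


def find_entities(tokens: list[str]) -> list[tuple[str, str]]:
    """Scan left to right, preferring the longest phrase found in the gazetteer."""
    entities = []
    i = 0
    while i < len(tokens):
        for size in range(min(MAX_PHRASE_LEN, len(tokens) - i), 0, -1):
            key = tuple(t.lower() for t in tokens[i : i + size])
            if key in GAZETTEER:
                entities.append((" ".join(tokens[i : i + size]), GAZETTEER[key]))
                i += size
                break
        else:
            i += 1  # no entity starts at this token
    return entities


print(find_entities("Google about pantech solutions".split()))
# [('pantech solutions', 'ORGANIZATION')]
```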

6 Chunking
Picking individual words and grouping them into phrases.

Example
"Google", "about", "Pantech Solutions" to "Google about pantech solutions"
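Chunking usually runs on tokens that already carry part-of-speech tags. As a rough sketch, the function below merges runs of adjacent nouns into one phrase; the tags are assigned by hand here and the single grouping rule is an assumption. NLTK offers `RegexpParser` for grammar-based chunking over tagged tokens.

```python
def chunk_nouns(tagged: list[tuple[str, str]]) -> list[list[str]]:
    """Group runs of adjacent NOUN tokens into phrases; other tokens stand alone."""
    chunks: list[list[str]] = []
    previous_tag = None
    for word, tag in tagged:
        if tag == "NOUN" and previous_tag == "NOUN":
            chunks[-1].append(word)  # extend the current noun phrase
        else:
            chunks.append([word])    # start a new chunk
        previous_tag = tag
    return chunks


# Assumed tags for the example sentence (hand-assigned for this sketch).
tagged = [("Google", "VERB"), ("about", "PREP"),
          ("Pantech", "NOUN"), ("Solutions", "NOUN")]
print(chunk_nouns(tagged))
# [['Google'], ['about'], ['Pantech', 'Solutions']]
```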

3. Natural Language Toolkit - NLTK

This toolkit is one of the most powerful NLP libraries. It contains packages that help machines understand human language and reply to it with an appropriate response.

Tokenization, stemming, lemmatization, punctuation handling, character count, and word count are some of the features discussed in this tutorial.

4. Install NLTK

pip install nltk

5. Feature extraction in Text

TF-IDF stands for term frequency times inverse document frequency.

CountVectorizer
Converts a collection of text documents to a matrix of token counts.

HashingVectorizer
Converts a collection of text documents to a matrix of token occurrences.
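The difference between a token-count matrix and a hashed token-occurrence matrix can be sketched in plain Python. This is an illustration of the two ideas, not scikit-learn's implementation: the token pattern, the use of `zlib.crc32` as the hash, and the 16-column width are all assumptions chosen for the example.

```python
import re
import zlib
from collections import Counter

DOCS = [
    "This is the first document.",
    "This document is the second document.",
    "And this is the third one.",
    "Is this the first document?",
]


def tokenize(text: str) -> list[str]:
    """Lowercase and keep tokens of two or more word characters."""
    return re.findall(r"\b\w\w+\b", text.lower())


def count_matrix(docs: list[str]) -> tuple[list[str], list[list[int]]]:
    """Build a sorted vocabulary, then one row of token counts per document."""
    vocab = sorted({tok for doc in docs for tok in tokenize(doc)})
    rows = []
    for doc in docs:
        counts = Counter(tokenize(doc))
        rows.append([counts[term] for term in vocab])
    return vocab, rows


def hashed_matrix(docs: list[str], n_features: int = 16) -> list[list[int]]:
    """Hashing trick: map each token to a column by hashing, with no vocabulary."""
    rows = []
    for doc in docs:
        row = [0] * n_features
        for tok in tokenize(doc):
            # CRC32 is an assumed stand-in hash, stable across runs.
            row[zlib.crc32(tok.encode("utf-8")) % n_features] += 1
        rows.append(row)
    return rows


vocab, counts = count_matrix(DOCS)
print(vocab)
print(counts[1])  # [0, 2, 0, 1, 0, 1, 1, 0, 1]
print(hashed_matrix(DOCS)[1])
```

The count matrix needs a pass over all documents to learn the vocabulary, while the hashed version has a fixed width and stores no vocabulary, at the cost that different tokens can collide in the same column.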

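TfidfVectorizer

Overview
Converts a collection of raw documents to a matrix of TF-IDF features. Equivalent to CountVectorizer followed by TfidfTransformer.

Example
TfidfVectorizer.get_feature_names_out

from sklearn.feature_extraction.text import TfidfVectorizer

DATA = [
    'This is the first document.',
    'This document is the second document.',
    'And this is the third one.',
    'Is this the first document?',
]
vectorizer = TfidfVectorizer()      # default settings
X = vectorizer.fit_transform(DATA)  # sparse matrix: one row per document
# get_feature_names_out() replaces get_feature_names(), which recent
# scikit-learn releases no longer provide
print(vectorizer.get_feature_names_out().tolist())

['and', 'document', 'first', 'is', 'one', 'second', 'the', 'third', 'this']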
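To make "term frequency times inverse document frequency" concrete, here is a from-scratch sketch on the same four documents. It assumes the smoothed variant idf(t) = ln((1 + n) / (1 + df(t))) + 1 followed by L2 normalization of each row, which is how scikit-learn documents its defaults; other references define idf differently, so treat the exact numbers as specific to that assumption.

```python
import math
import re
from collections import Counter

DOCS = [
    "This is the first document.",
    "This document is the second document.",
    "And this is the third one.",
    "Is this the first document?",
]


def tokenize(text: str) -> list[str]:
    """Lowercase and keep tokens of two or more word characters."""
    return re.findall(r"\b\w\w+\b", text.lower())


def tfidf_matrix(
    docs: list[str],
) -> tuple[list[str], dict[str, float], list[list[float]]]:
    """Return the vocabulary, idf weights, and L2-normalized TF-IDF rows."""
    tokenized = [tokenize(doc) for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    # Smoothed idf (assumed variant): ln((1 + n) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    rows = []
    for toks in tokenized:
        tf = Counter(toks)
        weights = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(w * w for w in weights)) or 1.0
        rows.append([w / norm for w in weights])
    return vocab, idf, rows


vocab, idf, rows = tfidf_matrix(DOCS)
print(vocab)
for term in ("the", "document", "second"):
    print(term, round(idf[term], 3))
```

A term that appears in every document ("the") gets the lowest idf, while a term that appears in only one document ("second") gets the highest, so rare terms dominate a document's row.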
