NLP Unit 3 230603 100859
BE IT (2019) (414442)
Savitribai Phule Pune University
Unit No. 3
Semantic Analysis
Introduction to Semantic Analysis
Semantic Analysis is a subfield of Natural Language Processing (NLP) that attempts to understand the meaning of natural language. Understanding natural language might seem straightforward to us as humans; however, due to the vast complexity and subjectivity of human language, interpreting it is a complicated task for machines. Semantic analysis captures the meaning of a given text while taking into account context, the logical structuring of sentences, and grammatical roles.
Parts of Semantic Analysis
Semantic analysis of natural language can be classified into two broad parts:
1. Lexical Semantic Analysis: understanding the meaning of each word of the text individually. It essentially amounts to fetching the dictionary meaning that a word in the text is intended to carry.
2. Compositional Semantic Analysis: although knowing the meaning of each word of the text is essential, it is not sufficient to completely understand the meaning of the text.
For example, consider the following two sentences:
- Sentence 1: Students love GeeksforGeeks.
- Sentence 2: GeeksforGeeks loves Students.
Although sentences 1 and 2 use the same set of root words {student, love, geeksforgeeks}, they convey entirely different meanings. Hence, under compositional semantic analysis, we try to understand how combinations of individual words form the meaning of the text.
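The contrast above can be sketched in a few lines of Python. This is a minimal illustration with hand-made data; the crude `rstrip("s")` stemming is an assumption for this toy example, not a real lemmatizer:

```python
def roots(sentence):
    # crude root extraction: lowercase each word and strip a trailing 's'
    return {w.lower().rstrip("s") for w in sentence.split()}

sent1 = "Students love GeeksforGeeks"
sent2 = "GeeksforGeeks loves Students"

same_roots = roots(sent1) == roots(sent2)    # True: identical root words
same_order = sent1.split() == sent2.split()  # False: word order differs
```

Lexically, the two sentences look identical (`same_roots` is true), yet the differing word order (`same_order` is false) reverses who loves whom, which is exactly what compositional analysis must capture.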
Tasks involved in Semantic Analysis
To understand the meaning of a sentence, the following are the major processes involved in semantic analysis:
1. Word Sense Disambiguation
2. Relationship Extraction
Word Sense Disambiguation:
In natural language, the meaning of a word may vary with its usage in sentences and the context of the text. Word sense disambiguation involves interpreting the meaning of a word based upon the context of its occurrence in a text.
For example, the word 'bark' may mean 'the sound made by a dog' or 'the outermost layer of a tree'. Likewise, the word 'rock' may mean 'a stone' or 'a genre of music'; hence, the accurate meaning of a word is highly dependent upon its context and usage in the text. Thus, the ability of a machine to overcome the ambiguity involved in identifying the meaning of a word based on its usage and context is called word sense disambiguation.
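A naive sketch of this idea: give each sense of an ambiguous word a few clue words, and pick the sense whose clues overlap most with the surrounding context. The sense inventory and clue words below are hand-made assumptions, not a real lexical resource:

```python
# Hypothetical sense inventory: each sense of "bark" lists clue words
# that typically co-occur with it in text.
SENSES = {
    "bark": {
        "sound made by a dog": {"dog", "loud", "howl"},
        "outer layer of a tree": {"tree", "trunk", "wood"},
    },
}

def disambiguate(word, context):
    tokens = set(context.lower().split())
    # pick the sense whose clue words overlap most with the context
    return max(SENSES[word], key=lambda s: len(SENSES[word][s] & tokens))

disambiguate("bark", "the dog gave a loud bark")  # -> 'sound made by a dog'
```

Real WSD systems replace the hand-made clue sets with dictionary glosses, annotated corpora, or learned context representations, as discussed later in this unit.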
Relationship Extraction:
Another important task involved in semantic analysis is relationship extraction. It involves first identifying the various entities present in the sentence and then extracting the relationships between those entities.
- Antonymy: Antonymy refers to a pair of lexical terms with contrasting meanings; they are symmetric about a semantic axis. For example: (Day, Night), (Hot, Cold), (Large, Small).
- Polysemy: Polysemy refers to lexical terms that have the same spelling but multiple closely related meanings. It differs from homonymy, in which the meanings of the terms need not be closely related. For example, 'man' may mean 'the human species', 'a male human', or 'an adult male human'; since all these meanings bear a close association, the lexical term 'man' is polysemous.
- Meronomy: Meronomy refers to a relationship wherein one lexical term is a constituent of some larger entity. For example, 'wheel' is a meronym of 'automobile'.
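The relationship-extraction step described above can be sketched with a toy pattern matcher. Restricting extraction to the single "X is a Y" pattern is an illustrative assumption; real systems use parsers and learned models:

```python
import re

def extract_is_a(sentence):
    # Toy relationship extraction: match "<X> is a <Y>" and return an
    # (entity, relation, concept) triple; None if the pattern is absent.
    m = re.match(r"(.+?) is a (.+?)\.?$", sentence)
    return (m.group(1), "is-a", m.group(2)) if m else None

extract_is_a("Delhi is a City")  # -> ('Delhi', 'is-a', 'City')
```

Running the extractor on "GeeksforGeeks is a Learning Portal" likewise yields the triple ('GeeksforGeeks', 'is-a', 'Learning Portal'); sentences without the pattern return None.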
Meaning Representation
While it is fairly simple for us as humans to understand the meaning of textual information, it is not so for machines. Thus, machines represent text in specific formal structures in order to interpret its meaning. This formal structure used to capture the meaning of a text is called a meaning representation.
Basic Units of a Semantic System:
To accomplish meaning representation in semantic analysis, it is vital to understand the building blocks of such representations. The basic units of semantic systems are explained below:
1. Entity: An entity refers to a particular individual or unit, such as a person or a location. For example: GeeksforGeeks, Delhi.
2. Concept: A concept may be understood as a generalization of entities; it refers to a broad class of individual units. For example: Learning Portals, Cities, Students.
3. Relations: Relations establish relationships between entities and concepts. For example: 'GeeksforGeeks is a Learning Portal', 'Delhi is a City'.
4. Predicate: Predicates represent the verb structures of sentences.
In meaning representation, we employ these basic units to represent textual information.
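The basic units above can be sketched as plain Python data, with entities and concepts as sets and relations as triples linking them. The names are the examples from the text; the representation itself is a simplified assumption:

```python
entities  = {"GeeksforGeeks", "Delhi"}
concepts  = {"Learning Portal", "City"}
relations = [
    ("GeeksforGeeks", "is-a", "Learning Portal"),
    ("Delhi", "is-a", "City"),
]

def concepts_of(entity):
    # look up every concept an entity is related to via "is-a"
    return [c for e, r, c in relations if e == entity and r == "is-a"]

concepts_of("Delhi")  # -> ['City']
```

Formal approaches such as FOPL or semantic nets (listed next) are richer, more principled versions of this same entity/concept/relation bookkeeping.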
Approaches to Meaning Representation:
Now that we are familiar with the basics of meaning representation, here are some of the most popular approaches:
1. First-order predicate logic (FOPL)
2. Semantic nets
3. Frames
4. Conceptual dependency (CD)
5. Rule-based architecture
6. Case grammar
7. Conceptual graphs
Semantic Analysis Techniques
Depending upon the end goal one is trying to accomplish, semantic analysis can be used in various ways. Two of the most common semantic analysis techniques are text classification and automated question answering.
Text Classification
In text classification, the aim is to label the text according to the insights we intend to gain from the textual data.
Besides, semantic analysis is also widely employed in automated answering systems such as chatbots, which answer user queries without any human intervention.
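A minimal sketch of text classification: label a text by counting hits against hand-made keyword lists. The labels and keywords below are assumptions for illustration; practical classifiers learn these associations from labeled data:

```python
# Hypothetical keyword lists per label, standing in for a trained model.
KEYWORDS = {
    "sports":  {"match", "score", "team", "goal"},
    "finance": {"stock", "market", "bank", "profit"},
}

def classify(text):
    tokens = set(text.lower().split())
    # choose the label whose keyword list overlaps most with the text
    return max(KEYWORDS, key=lambda label: len(KEYWORDS[label] & tokens))

classify("the team scored a late goal to win the match")  # -> 'sports'
```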
- Ambiguity
Ambiguity in computational linguistics is a situation where a word or a sentence may have more than one meaning; that is, a sentence may be interpreted in more than one way. This leads to uncertainty in choosing the right meaning of a sentence, especially when processing natural languages by computer.
- Resolving ambiguity is a challenging task in natural language understanding (NLU).
- The process of handling ambiguity is called disambiguation.
- Ambiguity is present in almost all the steps of natural language processing (lexical analysis, syntactic analysis, semantic analysis, discourse analysis, and pragmatic analysis).
Consider the following sentence as an example:
"Raj tried to reach his friend on the mobile, but he didn't attend"
In this sentence, we have the presence of lexical, syntactic, and anaphoric ambiguities.
1. Lexical ambiguity – The word "tried" means "attempted", not "judged" or "tested". Also, the word "reach" means "establish communication", not "gain", "pass", or "strive".
2. Syntactic ambiguity – The phrase "on the mobile" attaches to "reach" and thus means "using the mobile"; it is not attached to "friend".
3. Anaphoric ambiguity – The anaphor "he" refers to the "friend", not to "Raj".
The following are the types of ambiguity with respect to natural language processing tasks:
1. Lexical ambiguity
This is the class of ambiguity caused by a word and its multiple senses, especially when the word is part of a sentence or phrase. A word can have multiple meanings under different part-of-speech (POS) categories, and under each POS category it may have multiple different senses. Lexical ambiguity is about choosing which sense of a particular word applies under a particular POS category. In a sentence, lexical ambiguity arises while choosing the right sense of a word under the correct POS category.
For example, take the sentence "I saw a ship". Here, the words "saw" and "ship" can mean multiple things:
Saw = the present tense of the verb saw (cut with a saw), OR the past tense of the verb see (perceive by sight), OR the noun saw (a blade for cutting), etc. According to WordNet, the word "saw" is defined with 3 different senses in the NOUN category and 25 different senses in the VERB category.
Ship = the present tense of the verb ship (transport commercially), OR the present tense of the verb ship (travel by ship), OR the noun ship (a vessel that carries passengers), etc. As per WordNet, the word "ship" is defined with 1 sense in the NOUN category and 5 senses in the VERB category.
Due to these multiple meanings, an ambiguity arises in choosing the right sense of "saw" and "ship".
Handling lexical ambiguity
Lexical ambiguity can be handled using tasks like POS tagging and word sense disambiguation.
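To see how POS tagging narrows down lexical ambiguity, here is a deliberately naive rule-based tagger. The two rules (a word after a pronoun or noun is read as a verb; a word after a determiner as a noun) are hypothetical simplifications of what a statistical tagger learns:

```python
PRONOUNS = {"i", "you", "he", "she", "we", "they"}
DETERMINERS = {"a", "an", "the"}

def tag(tokens):
    tags = []
    for i, w in enumerate(tokens):
        lw = w.lower()
        if lw in PRONOUNS:
            tags.append("PRON")
        elif lw in DETERMINERS:
            tags.append("DET")
        elif i > 0 and tags[-1] in ("PRON", "NOUN"):
            tags.append("VERB")   # e.g. "I saw" -> "saw" is a verb
        elif i > 0 and tags[-1] == "DET":
            tags.append("NOUN")   # e.g. "a ship" -> "ship" is a noun
        else:
            tags.append("NOUN")   # fallback guess
    return tags

tag("I saw a ship".split())  # -> ['PRON', 'VERB', 'DET', 'NOUN']
```

Once "saw" is tagged VERB and "ship" NOUN, the 3 noun senses of "saw" and the 5 verb senses of "ship" are ruled out, leaving WSD a smaller set of candidates.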
Test Corpus
Another input required by WSD is a sense-annotated test corpus that has the target or correct senses. Test corpora can be of two types −
- Lexical sample − This kind of corpus is used in systems that need to disambiguate only a small sample of words.
- All-words − This kind of corpus is used in systems that are expected to disambiguate all the words in a piece of running text.
❖ Approaches and Methods to Word Sense Disambiguation (WSD)
Approaches and methods to WSD are classified according to the source of knowledge used in word disambiguation. Let us now see the four conventional methods of WSD −
Dictionary-based or Knowledge-based Methods
As the name suggests, these methods primarily rely on dictionaries, thesauri, and lexical knowledge bases for disambiguation; they do not use corpus evidence. The Lesk method is the seminal dictionary-based method, introduced by Michael Lesk in 1986. The Lesk definition, on which the Lesk algorithm is based, is "measure overlap between sense definitions for all words in context". In 2000, Kilgarriff and Rosenzweig gave the simplified Lesk definition as "measure overlap between sense definitions of word and current context", which means identifying the correct sense for one word at a time. Here the current context is the set of words in the surrounding sentence or paragraph.
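The simplified Lesk definition quoted above can be sketched directly: score each sense by the word overlap between its dictionary gloss and the current context. The glosses below are hand-written stand-ins for real dictionary definitions, and "bank" is an illustrative target word:

```python
# Hand-made glosses standing in for a real dictionary or WordNet.
GLOSSES = {
    "bank/finance": "a financial institution that accepts deposits",
    "bank/river":   "sloping land beside a river",
}

def simplified_lesk(glosses, context):
    ctx = set(context.lower().split())
    # choose the sense whose gloss shares the most words with the context
    return max(glosses, key=lambda s: len(set(glosses[s].split()) & ctx))

simplified_lesk(GLOSSES, "the bank accepts deposits from customers")
# -> 'bank/finance'
```

A fuller implementation would also remove stop words and stem both gloss and context; here raw word overlap is enough to separate the two senses.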
Supervised Methods
For disambiguation, machine learning methods make use of sense-annotated corpora for training. These methods assume that the context alone can provide enough evidence to disambiguate the sense, so world knowledge and reasoning are deemed unnecessary. The context is represented as a set of "features" of the words, including information about the surrounding words. Support vector machines and memory-based learning are the most successful supervised learning approaches to WSD. These methods rely on a substantial amount of manually sense-tagged corpora, which is very expensive to create.
Semi-supervised Methods
Due to the lack of training corpora, most word sense disambiguation algorithms use semi-supervised learning methods, which use both labelled and unlabelled data. These methods require a very small amount of annotated text and a large amount of plain unannotated text. The technique used by semi-supervised methods is bootstrapping from seed data.
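Bootstrapping from seed data can be sketched as self-training: start from a few labelled seed contexts, then label each unlabelled context with the sense of the most similar already-labelled context. All the contexts and sense labels here are hand-made assumptions:

```python
# Seed contexts with known senses of "bark" (hypothetical data).
seeds = {
    "the dog began to bark loudly": "animal-sound",
    "the bark of the oak tree": "tree-layer",
}
unlabeled = [
    "a loud bark woke the dog owner",
    "moss grew on the tree bark",
]

def label(context, labeled):
    ctx = set(context.split())
    # copy the label of the most word-overlapping labelled context
    best = max(labeled, key=lambda c: len(set(c.split()) & ctx))
    return labeled[best]

labeled = dict(seeds)
for c in unlabeled:          # one bootstrapping pass over the plain text
    labeled[c] = label(c, labeled)
```

Each newly labelled context joins the pool and can help label later ones, which is the essence of growing a classifier from a small seed set.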
Unsupervised Methods
These methods assume that similar senses occur in similar contexts, so senses can be induced from text by clustering word occurrences using some measure of contextual similarity. This task is called word sense induction or discrimination. Unsupervised methods have great potential to overcome the knowledge-acquisition bottleneck because they do not depend on manual effort.
❖ Applications of Word Sense Disambiguation (WSD)
Word sense disambiguation (WSD) is applied in almost every application of language technology. Let us now see the scope of WSD −
Machine Translation
Machine translation (MT) is the most obvious application of WSD. In MT, the lexical choice for words that have distinct translations for different senses is done by WSD; the senses in MT are represented as words in the target language. However, most machine translation systems do not use an explicit WSD module.
Information Retrieval (IR)
Information retrieval (IR) may be defined as a software program that deals with the organization, storage, retrieval, and evaluation of information from document repositories, particularly textual information. The system assists users in finding the information they require, but it does not explicitly return the answers to their questions. WSD is used to resolve the ambiguities of the queries provided to an IR system.
Word-sense discreteness
Another difficulty in WSD is that words cannot always be divided neatly into discrete sub-meanings.
Discourse Processing
Processing natural language by computer is among the most difficult problems of artificial intelligence. One of the major problems in NLP is discourse processing − building theories and models of how utterances stick together to form coherent discourse. Language usually consists of collocated, structured, and coherent groups of sentences rather than isolated and unrelated sentences. These coherent groups of sentences are referred to as discourse.
❖ Concept of Coherence
Coherence and discourse structure are interconnected in many ways. Coherence, a property of good text, is used to evaluate the output quality of natural language generation systems. What does it mean for a text to be coherent? Suppose we collected one sentence from every page of a newspaper − would the result be a discourse? Of course not, because these sentences do not exhibit coherence. A coherent discourse must possess the following properties −
Coherence relation between utterances
A discourse is coherent if it has meaningful connections between its utterances. This property is called the coherence relation. For example, some sort of explanation must exist to justify the connection between utterances.
Relationship between entities
Another property that makes a discourse coherent is that there must be a certain kind of relationship between the entities it mentions. This kind of coherence is called entity-based coherence.
❖ Discourse structure
An important question regarding discourse is what kind of structure a discourse must have. The answer depends upon the segmentation applied to the discourse. Discourse segmentation may be defined as determining the types of structures within a large discourse. Discourse segmentation is quite difficult to implement, but it is very important for applications such as information retrieval, text summarization, and information extraction.
❖ Algorithms for Discourse Segmentation
In this section, we will learn about the algorithms for discourse segmentation. The algorithms are described below −
Unsupervised Discourse Segmentation
Unsupervised discourse segmentation is often framed as linear segmentation. We can understand the task of linear segmentation with an example: segmenting a text into multi-paragraph units, where each unit represents a passage of the original text. These algorithms depend on cohesion, which may be defined as the use of certain linguistic devices to tie textual units together. Lexical cohesion, in particular, is cohesion indicated by relationships between two or more words in two units, such as the use of synonyms.
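A TextTiling-style sketch of cohesion-based linear segmentation: place a boundary between adjacent sentences whose word overlap falls below a threshold, taking low lexical cohesion as a sign of a topic shift. The threshold and the example sentences are assumptions for illustration:

```python
def segment(sentences, threshold=2):
    boundaries = []
    for i in range(len(sentences) - 1):
        a = set(sentences[i].lower().split())
        b = set(sentences[i + 1].lower().split())
        if len(a & b) < threshold:       # low cohesion -> topic shift
            boundaries.append(i + 1)
    return boundaries

sents = [
    "the bank raised its interest rates",
    "interest rates affect the bank customers",
    "the river flows past the old mill",
]
segment(sents)  # -> [2]: boundary before the third sentence
```

Real TextTiling smooths overlap scores over windows of token sequences rather than comparing raw sentence pairs, but the boundary-at-a-cohesion-dip idea is the same.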
Supervised Discourse Segmentation
The earlier method does not require any hand-labeled segment boundaries; supervised discourse segmentation, on the other hand, needs boundary-labeled training data, which is easy to acquire. In supervised discourse segmentation, discourse markers or cue words play an important role. A discourse marker or cue word is a word or phrase that signals discourse structure. These discourse markers are often domain-specific.
❖ Text Coherence
Lexical repetition is one way to find structure in a discourse, but it does not by itself satisfy the requirement of coherent discourse. To achieve coherent discourse, we must focus on coherence relations specifically. As we know, a coherence relation defines the possible connection between utterances in a discourse. Hobbs proposed a set of such coherence relations.
Reference Resolution
Interpreting the sentences of a discourse is another important task, and to achieve it we need to know who or what entity is being talked about. Here, reference interpretation is the key element. Reference may be defined as a linguistic expression used to denote an entity or individual. For example, in the passage "Ram, the manager of ABC bank, saw his friend Shyam at a shop. He went to meet him", the linguistic expressions Ram, his, and He are references. Accordingly, reference resolution may be defined as the task of determining which entities are referred to by which linguistic expressions.
❖ Terminology Used in Reference Resolution
We use the following terminology in reference resolution −
- Referring expression − The natural language expression used to perform reference is called a referring expression. For example, the passage used above contains referring expressions.
- Referent − The entity that is referred to. For example, in the example above, Ram is a referent.
- Corefer − When two expressions are used to refer to the same entity, they are called corefers. For example, Ram and He are corefers.
- Antecedent − The term that licenses the use of another term. For example, Ram is the antecedent of the reference He.
- Anaphora & anaphoric − Anaphora is a reference to an entity that has been previously introduced into the discourse; the referring expression is called anaphoric.
- Discourse model − The model that contains representations of the entities that have been referred to in the discourse and the relationships they participate in.
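A naive reference-resolution sketch for the passage above: resolve each pronoun to the most recently mentioned entity. The entity list and the recency heuristic are assumptions, not a real coreference system:

```python
ENTITIES = {"Ram", "Shyam"}

def resolve(tokens):
    mentions, resolved = [], {}
    for i, tok in enumerate(tokens):
        word = tok.strip(".,")
        if word in ENTITIES:
            mentions.append(word)            # record entity mention
        elif word.lower() in {"he", "him", "his"} and mentions:
            resolved[i] = mentions[-1]       # most recent antecedent
    return resolved

tokens = "Ram saw his friend Shyam at a shop . He went to meet him".split()
resolve(tokens)  # his -> Ram, He -> Shyam, him -> Shyam
```

Note that the recency heuristic resolves "He" to Shyam, whereas in the passage "He" actually refers to Ram; this failure illustrates exactly why anaphoric ambiguity requires richer discourse models than pure recency.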
❖ Types of Referring Expressions
Let us now see the different types of referring expressions. The five types of referring expressions are described below −
Indefinite Noun Phrases
This kind of reference introduces entities that are new to the hearer into the discourse context. For example, in the sentence "Ram had gone around one day to bring him some food", some is an indefinite reference.
Definite Noun Phrases
In contrast to the above, this kind of reference represents entities that are not new and are identifiable to the hearer within the discourse context. For example, in the sentence "I used to read The Times of India", The Times of India is a definite reference.