
exp-2 nlp

The document describes various programs for generating text using different natural language processing techniques, including random word selection, bigrams, trigrams, Markov chains, and the GPT-2 model. Each program demonstrates how to generate words or sentences based on a given input or starting word. Outputs from these programs include randomly selected words and generated sentences that reflect the structure of the input text.

Program:

import random

# Pick one word at random from a small vocabulary
vocabulary = ["natural", "language", "preprocessing", "machine",
              "learning", "artificial", "intelligence"]
random_word = random.choice(vocabulary)
print("Random Word:", random_word)

Output:

Random Word: natural

Program:

from nltk.util import bigrams
from nltk.probability import FreqDist
from random import choices

text = "natural language processing is fascinating and language is powerful"
tokens = text.split()

# Build a frequency distribution over adjacent word pairs
bigrams_list = list(bigrams(tokens))
bigram_fd = FreqDist(bigrams_list)

def generate_next_word(last_word, bigram_fd):
    # Collect every bigram that starts with the given word
    possible_bigrams = [pair for pair in bigram_fd if pair[0] == last_word]
    if not possible_bigrams:
        return None
    # Sample the next word, weighted by bigram frequency
    words, weights = zip(*[(pair[1], bigram_fd[pair]) for pair in possible_bigrams])
    return choices(words, weights=weights)[0]

start_word = "language"
generated_word = generate_next_word(start_word, bigram_fd)
print("Next Word for '{}': {}".format(start_word, generated_word))

Output:

Next Word for 'language': is
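The bigram sampler returns a single next word; calling it repeatedly walks the chain and produces a whole sentence. A minimal stdlib-only sketch of the same idea, using zip and Counter in place of nltk (the generate_sentence helper and length parameter here are illustrative additions, not part of the original program):

```python
from collections import Counter
from random import choices

text = "natural language processing is fascinating and language is powerful"
tokens = text.split()

# Count adjacent word pairs: equivalent to nltk's bigrams + FreqDist
bigram_counts = Counter(zip(tokens, tokens[1:]))

def generate_next_word(last_word, bigram_counts):
    # Candidate bigrams starting with last_word, with their counts
    candidates = [(pair, n) for pair, n in bigram_counts.items() if pair[0] == last_word]
    if not candidates:
        return None
    words = [pair[1] for pair, _ in candidates]
    weights = [n for _, n in candidates]
    return choices(words, weights=weights)[0]

def generate_sentence(start_word, bigram_counts, length=8):
    sentence = [start_word]
    while len(sentence) < length:
        next_word = generate_next_word(sentence[-1], bigram_counts)
        if next_word is None:  # dead end: no bigram starts with this word
            break
        sentence.append(next_word)
    return " ".join(sentence)

print(generate_sentence("language", bigram_counts))
```

Because sampling is weighted and random, the sentence differs between runs; it always ends early if it reaches "powerful", the only word with no observed successor.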

Program:

from nltk.util import trigrams
from nltk import FreqDist
from random import choices

text = "natural language processing is fascinating and language is powerful"
tokens = text.split()

# Frequency distribution over word triples
trigrams_list = list(trigrams(tokens))
trigram_fd = FreqDist(trigrams_list)

def generate_sentence(trigram_fd, start_words, length=10):
    sentence = list(start_words)
    for _ in range(length - len(start_words)):
        # Trigrams whose first two words match the last two generated words
        possible_trigrams = [trigram for trigram in trigram_fd
                             if trigram[:2] == tuple(sentence[-2:])]
        if not possible_trigrams:
            break
        # Sample the third word, weighted by trigram frequency
        words, weights = zip(*[(trigram[2], trigram_fd[trigram])
                               for trigram in possible_trigrams])
        next_word = choices(words, weights=weights)[0]
        sentence.append(next_word)
    return " ".join(sentence)

start_words = ("language", "is")
generated_sentence = generate_sentence(trigram_fd, start_words)
print("Generated Sentence:", generated_sentence)

Output:

Generated Sentence: language is powerful

Program:

from collections import defaultdict
import random

text = "natural language processing is fascinating and language is powerful."
tokens = text.split()

# Map each word to the list of words observed after it
markov_chain = defaultdict(list)
for i in range(len(tokens) - 1):
    markov_chain[tokens[i]].append(tokens[i + 1])

def generate_text(markov_chain, start_word, length=10):
    current_word = start_word
    result = [current_word]
    for _ in range(length - 1):
        next_words = markov_chain.get(current_word)
        if not next_words:  # no known successor: stop early
            break
        current_word = random.choice(next_words)
        result.append(current_word)
    return " ".join(result)

start_word = "language"
generated_text = generate_text(markov_chain, start_word)
print("Generated Text:", generated_text)

Output:

Generated Text: language is fascinating and language processing is powerful.
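The defaultdict built above is the entire model: each word maps to the list of words observed after it, with repeats preserving frequency. Reconstructing the table for the sample sentence makes the structure concrete (a stdlib-only sketch; the zip-based pairing is an equivalent rewrite of the index loop):

```python
from collections import defaultdict

text = "natural language processing is fascinating and language is powerful."
tokens = text.split()

# Same transition table as the program above, built by pairing
# each word with its successor
markov_chain = defaultdict(list)
for current, following in zip(tokens, tokens[1:]):
    markov_chain[current].append(following)

for word, followers in markov_chain.items():
    print(word, "->", followers)
# language -> ['processing', 'is']    (two observed successors)
# is -> ['fascinating', 'powerful.']  (the chain's branch points)
```

Words with a single successor ("natural", "processing") make the walk deterministic at that step; "language" and "is" are where the generated text can diverge between runs.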

Program:

from transformers import GPT2LMHeadModel, GPT2Tokenizer

model_name = "gpt2"
model = GPT2LMHeadModel.from_pretrained(model_name)
tokenizer = GPT2Tokenizer.from_pretrained(model_name)

prompt = "Natural Language Processing"
inputs = tokenizer.encode(prompt, return_tensors="pt")

# Greedy decoding (the default); repetitive output like the one
# below is typical without sampling
outputs = model.generate(inputs, max_length=50, num_return_sequences=1)
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print("Generated Text:", generated_text)

Output:

Generated Text: Natural Language Processing (LISP) is a new approach to processing and
processing data in a language. It is a new approach to processing and processing data in
a language. It is a new approach to processing and processing data in a language.
