Chintu
Regulation : R18/JNTUH
Prepared By
MISSION
1. To provide high-quality education along with professional training and exposure to the workplace.
2. To encourage a professional mindset that goes beyond academic achievement.
3. To promote holistic education among department students by means of integrated pedagogy and
scholarly mentoring for excellence in both personal and professional domains.
4. To consistently enhance the teaching and learning processes in order to prepare students for successful
careers in business, overseas, or in further education.
5. To carefully prepare students to be globally employable professionals who will meet societal demands
and contribute to the nation's technological advancement through their research and innovative talents.
MISSION
1. To provide quality education and generate new knowledge by engaging in cutting-edge research and
by offering state-of-the-art undergraduate and postgraduate programmes leading to careers as computer
professionals in the widely diversified domains of industry, government and academia.
2. To promote a teaching and learning process that yields advancements in the state of the art in computer
science and engineering, and integrates research results and innovations into other scientific disciplines,
leading to new products.
3. To harness human capital for a sustainable competitive edge and social relevance by inculcating the
philosophy of continuous learning and innovation in computer science and engineering.
PO8 Ethics: Apply ethical principles and commit to professional ethics and responsibilities and norms of the engineering practice.
PO9 Individual and team work: Function effectively as an individual, and as a member or leader in diverse teams, and in multi-disciplinary settings.
PO10 Communication: Communicate effectively on complex engineering activities with the engineering community and with society at large, such as being able to comprehend and write effective reports and design documentation, make effective presentations, and give and receive clear instructions.
PO11 Project management and finance: Demonstrate knowledge and understanding of the engineering and management principles and apply these to one's own work, as a member and leader in a team, to manage projects and in multidisciplinary environments.
PO12 Life-long learning: Recognize the need for, and have the preparation and ability to engage in, independent and life-long learning in the broadest context of technological change.
Course Objectives
To provide hands-on experience on web technologies.
To develop client-server applications using web technologies.
To introduce server-side programming with Java servlets and JSP.
To understand the various phases in the design of a compiler
To understand the design of top-down and bottom-up parsers.
To understand syntax directed translation schemes.
To introduce lex and yacc tools.
CO   PO1  PO2  PO3  PO4  PO5  PO6  PO7  PO8  PO9  PO10 PO11 PO12 PSO1 PSO2 PSO3
CO1  2    -    -    1    -    -    -    -    -    -    -    -    -    -    -
CO2  2    -    -    1    -    -    -    -    -    -    -    -    1    -    -
CO3  2    1    -    1    -    -    -    -    -    -    -    -    1    1    -
CO4  2    2    2    1    -    -    -    -    -    -    -    -    2    2    1
0021
Course Objectives: Exposure to various web and social media analytic techniques.
Course Outcomes:
List of Experiments
Resources:
2. GOOGLE.COM/ANALYTICS
TEXT BOOKS:
REFERENCE BOOKS:
1. Rajiv Sabherwal, Irma Becerra-Fernandez, "Business Intelligence: Practice, Technologies and Management", John Wiley, 2011.
2. Larissa T. Moss, Shaku Atre, "Business Intelligence Roadmap", Addison-Wesley Information Technology Series.
3. Yuli Vasiliev, "Oracle Business Intelligence: The Condensed Guide to Analysis and Reporting", SPD Shroff, 2012.
PROGRAM - 1
a) Stopword Elimination
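# Requires the NLTK 'stopwords' corpus and 'punkt' tokenizer data (nltk.download)
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

# Input sentence, reconstructed from the tokens shown in the output
example_sent = "This is a program for stop list, so filter the words students."
stop_words = set(stopwords.words('english'))
word_tokens = word_tokenize(example_sent)
# Case-insensitive filter: lower-case each token before checking stop_words
filtered_sentence = [w for w in word_tokens if w.lower() not in stop_words]
# Case-sensitive filter (no lower-case conversion); this result is the one printed,
# which is why 'This' survives in the output
filtered_sentence = []
for w in word_tokens:
    if w not in stop_words:
        filtered_sentence.append(w)
print(word_tokens)
print(filtered_sentence)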
OUTPUT :
word_tokens = ['This', 'is', 'a', 'program', 'for', 'stop', 'list', ',', 'so', 'filter', 'the', 'words', 'students', '.']
filtered_sentence = ['This', 'program', 'stop', 'list', ',', 'filter', 'words', 'students', '.']
VIVA Questions:
1. What are stopwords and why are they removed from text data?
2. Can you provide examples of some common stopwords?
3. How can the removal of stopwords affect the performance of a text classification model?
b) Stemming:
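from nltk.stem import PorterStemmer
ps = PorterStemmer()
# Sample words assumed for illustration; the original list is not shown in this copy
e_words = ["wait", "waiting", "waited", "waits"]
# Stem each word in the list and print the root word
for w in e_words:
    rootWord = ps.stem(w)
    print(rootWord)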
Output:
wait
wait
wait
wait
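from nltk.stem.porter import PorterStemmer
ps = PorterStemmer()
for w in "It originated from the idea that there are readers who prefer learning new skills from the comforts of their drawing rooms".split():
    print("Actual: %s Stem: %s" % (w, ps.stem(w)))  # loop reconstructed from the output below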
Output:
Actual: It Stem: it
Actual: originated Stem: origin
Actual: from Stem: from
Actual: the Stem: the
Actual: idea Stem: idea
Actual: that Stem: that
Actual: there Stem: there
Actual: are Stem: are
Actual: readers Stem: reader
Actual: who Stem: who
Actual: prefer Stem: prefer
Actual: learning Stem: learn
Actual: new Stem: new
Actual: skills Stem: skill
Actual: from Stem: from
Actual: the Stem: the
Actual: comforts Stem: comfort
Actual: of Stem: of
Actual: their Stem: their
Actual: drawing Stem: draw
Actual: rooms Stem: room
VIVA Questions :
1. What is stemming and how does it differ from lemmatization?
2. Name some commonly used stemming algorithms.
3. What are the potential drawbacks of using stemming?
d) Lemmatization:
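from nltk.stem import WordNetLemmatizer
wordnet_lemmatizer = WordNetLemmatizer()
for w in "It originated from the idea that there are readers who prefer learning new skills from the comforts of their drawing rooms".split():
    print("Actual: %s Lemma: %s" % (w, wordnet_lemmatizer.lemmatize(w)))  # loop reconstructed from the output below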
Output:
Actual: It Lemma: It
Actual: originated Lemma: originated
Actual: from Lemma: from
Actual: the Lemma: the
Actual: idea Lemma: idea
Actual: that Lemma: that
Actual: there Lemma: there
Actual: are Lemma: are
Avanthi Institute of Engineering and Technology
AVANTHI INSTITUTE OF ENGINEERING AND TECHNOLOGY
(Approved by AICTE, Recg. By Govt. of T.S & Affiliated to JNTUH, Hyderabad)
NAAC “B++” Accredited Institute
Gunthapally (V), Abdullapurmet(M), RR Dist, Near Ramoji Film City, Hyderabad -501512.
www.aietg.ac.in email: [email protected]
VIVA Questions :
1. What is lemmatization and how does it differ from stemming?
2. How does lemmatization handle different parts of speech?
3. Why is lemmatization considered more accurate than stemming?
e) POS Tagging:
import nltk
print(nltk.pos_tag(nltk.word_tokenize("I am learning NLP in Python")))
Output:
[('I', 'PRP'), ('am', 'VBP'), ('learning', 'VBG'), ('NLP', 'NNP'), ('in', 'IN'), ('Python', 'NNP')]
f) spaCy Program:
import spacy
# Load the 'en_core_web_sm' model
nlp = spacy.load('en_core_web_sm')
# Define the sentence
sentence = "I am learning NLP in Python"
# Process the sentence using spaCy's NLP pipeline
doc = nlp(sentence)
# Iterate through the tokens and print the token text and POS tag
for token in doc:
    print(token.text, token.pos_)
Output:
I PRON
am AUX
learning VERB
NLP PROPN
in ADP
Python PROPN
VIVA Questions :
1. What is POS tagging and why is it important in NLP?
2. Can you list the common POS tags used by NLTK?
3. How does POS tagging contribute to the understanding of a sentence's structure?
g) Lexical analysis:
Output:
VIVA Questions :
1. What is lexical analysis in the context of NLP?
2. How does tokenization help in text preprocessing?
3. Can you explain the difference between tokenization at the word level and at the sentence level?
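The lexical analysis listing is missing from this copy. As a minimal sketch, assuming plain regular-expression rules rather than any particular NLP library, word-level and sentence-level tokenization can be illustrated as follows. The function names and the sample text are hypothetical.

```python
import re

def simple_sentence_tokenize(text):
    # Split after '.', '!' or '?' when followed by whitespace. This naive rule
    # will mis-split abbreviations such as "Dr. Rao".
    parts = re.split(r'(?<=[.!?])\s+', text.strip())
    return [p for p in parts if p]

def simple_word_tokenize(text):
    # A token is either a run of word characters or a single punctuation mark.
    return re.findall(r"\w+|[^\w\s]", text)

text = "I am learning NLP in Python. It is fun!"  # sample text (assumed)
print(simple_sentence_tokenize(text))
print(simple_word_tokenize(text))
```

Sentence-level tokens keep their punctuation attached, while word-level tokens separate it, which is the distinction the third viva question asks about.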
PROGRAM - 2
# Inspect the first few rows and summary statistics of the dataset
print(data.head())
print(data.describe())
# Inspect the first few rows of the dataset with sentiment scores
print(data.head())
z = sum(data["Neutral"])
Output:
0 1 1 5 1303862400
1 0 0 1 1346976000
2 1 1 4 1219017600
3 3 3 2 1307923200
4 0 0 5 1350777600
Summary Text
0 Good Quality Dog Food I have bought several of the Vitality canned d...
2 "Delight" says it all This is a confection that has been around a fe...
3 Cough Medicine If you are looking for the secret ingredient i...
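The Program 2 listing omits the steps that load the reviews and compute the sentiment scores. As a rough sketch only, assuming a tiny hand-made word list in place of a real sentiment lexicon, the idea of scoring each text and totalling a "Neutral" column can be illustrated like this. The word lists, the `score_text` helper, and the sample reviews are hypothetical and are not the dataset or scorer used in the listing.

```python
# Toy lexicon (assumed for illustration; a real analyzer uses a far larger one)
POSITIVE = {"good", "great", "delight", "love", "excellent"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "awful"}

def score_text(text):
    # Return the fraction of words that are positive, negative and neutral.
    words = [w.strip('.,!?"').lower() for w in text.split()]
    words = [w for w in words if w]
    if not words:
        return {"Positive": 0.0, "Negative": 0.0, "Neutral": 0.0}
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    n = len(words)
    return {"Positive": pos / n, "Negative": neg / n, "Neutral": (n - pos - neg) / n}

# Made-up sample reviews
reviews = ["Good quality dog food", "Terrible taste, bad smell", "It arrived on Monday"]
scores = [score_text(r) for r in reviews]

# Total each column, mirroring the sum over data["Neutral"] in the listing
x = sum(s["Positive"] for s in scores)
y = sum(s["Negative"] for s in scores)
z = sum(s["Neutral"] for s in scores)
overall = max([("Positive", x), ("Negative", y), ("Neutral", z)], key=lambda t: t[1])[0]
print(overall)
```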
VIVA Questions :
PROGRAM – 3
3. Web analytics
a) Web usage data (web server log data, clickstream analysis)
import pandas as pd
import matplotlib.pyplot as plt
import re
print(df.head())
Output:
IP User Timestamp Request Status Size
0 127.0.0.1 frank 2000-10-10 13:55:36-07:00 GET /apache_pb.gif HTTP/1.0 200 2326
1 127.0.0.1 frank 2000-10-10 13:56:00-07:00 POST /form HTTP/1.1 404 321
2 192.168.0.1 jane 2000-10-11 14:05:36-07:00 GET /index.html HTTP/1.0 200 124
3 10.0.0.1 bob 2000-10-12 15:15:36-07:00 GET /about HTTP/1.0 500 532
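The parsing step that produces `df` is not shown. A minimal sketch, assuming the log is in Apache Common Log Format and using only the standard library (no pandas), is given below. The sample lines are reconstructed from the output above, and the `LOG_PATTERN` and `parse_line` names are hypothetical.

```python
import re
from collections import Counter
from datetime import datetime

# One named group per column of the output table
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ (?P<user>\S+) \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_line(line):
    # Return a dict for a well-formed line, or None if the line does not match.
    m = LOG_PATTERN.match(line)
    if m is None:
        return None
    rec = m.groupdict()
    rec["timestamp"] = datetime.strptime(rec["timestamp"], "%d/%b/%Y:%H:%M:%S %z")
    rec["status"] = int(rec["status"])
    rec["size"] = 0 if rec["size"] == "-" else int(rec["size"])
    return rec

log_lines = [
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326',
    '127.0.0.1 - frank [10/Oct/2000:13:56:00 -0700] "POST /form HTTP/1.1" 404 321',
    '192.168.0.1 - jane [11/Oct/2000:14:05:36 -0700] "GET /index.html HTTP/1.0" 200 124',
    '10.0.0.1 - bob [12/Oct/2000:15:15:36 -0700] "GET /about HTTP/1.0" 500 532',
]
records = [r for r in map(parse_line, log_lines) if r is not None]

# Two simple usage statistics: responses per status code and hits per client IP
status_counts = Counter(r["status"] for r in records)
hits_per_ip = Counter(r["ip"] for r in records)
print(status_counts)
print(hits_per_ip.most_common(1))
```

The list of dictionaries in `records` could be passed to `pandas.DataFrame` to obtain a table like the one printed above.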
VIVA Questions :
b) Hyperlink Data:
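from collections import Counter

def analyze_hyperlinks(links):
    # Count occurrences of each hyperlink
    link_counts = Counter(links)
    total_links = len(links)
    unique_links = len(link_counts)
    # All links, most frequent first
    most_common_links = link_counts.most_common()
    # Display results
    print(f"Total hyperlinks: {total_links}")
    print(f"Unique hyperlinks: {unique_links}")
    print("\nHyperlink occurrences:")
    for link, count in most_common_links:
        print(f"{link}: {count}")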
Output:
Total hyperlinks: 10
Unique hyperlinks: 5
Hyperlink occurrences:
https://example.com/page1: 3
https://example.com/page2: 3
https://example.com/page3: 2
https://example.org/home: 1
https://example.org/about: 1
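The hyperlink listing assumes that `links` is already available. As one possible way to obtain it, the sketch below collects `href` values from an HTML page with the standard-library `html.parser` module and counts them. The `LinkCollector` class and the sample HTML are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

# Made-up page fragment for illustration
sample_html = """
<a href="https://example.com/page1">One</a>
<a href="https://example.com/page2">Two</a>
<a href="https://example.com/page1">One again</a>
<a name="anchor-without-href">No link</a>
"""

collector = LinkCollector()
collector.feed(sample_html)
link_counts = Counter(collector.links)
print(len(collector.links), len(link_counts))
print(link_counts.most_common(1))
```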
VIVA Questions :
PROGRAM - 4
Spamdexing, also known as search engine spamming or search engine poisoning, refers to various methods used
to manipulate search engine rankings to favor certain pages in ways that violate the search engine's terms of
service. This is generally considered unethical and is penalized by search engines. However, for educational
purposes, we can implement a basic example to understand how such techniques work.
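def generate_spam_content(keywords, original_content, repetitions=10):
    # Keyword stuffing: append the keyword list to the page text `repetitions` times.
    # The function name and the repetitions parameter are assumed; only the
    # return statement of the original function survives in this copy.
    spam_content = original_content + " ".join(keywords * repetitions)
    return spam_content

# Example usage
keywords = ["buy cheap products", "best prices", "discount sales", "online shopping"]
original_content = """
Welcome to our online store. We offer a wide range of products at the best prices.
Browse through our collection and find the best deals for your needs.
"""
spam_content = generate_spam_content(keywords, original_content)
print("Original Content:\n")
print(original_content)
print("\nSpam Content:\n")
print(spam_content)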
Output:
Original Content:
Welcome to our online store. We offer a wide range of products at the best prices.
Browse through our collection and find the best deals for your needs.
Spam Content:
Welcome to our online store. We offer a wide range of products at the best prices.
Browse through our collection and find the best deals for your needs.
buy cheap products best prices discount sales online shopping buy cheap products best prices discount
sales online shopping buy cheap products best prices discount sales online shopping buy cheap products
best prices discount sales online shopping buy cheap products best prices discount sales online shopping
buy cheap products best prices discount sales online shopping buy cheap products best prices discount
sales online shopping buy cheap products best prices discount sales online shopping buy cheap products
best prices discount sales online shopping buy cheap products best prices discount sales online shopping
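Output like this is easy to flag with a keyword-density check. The following is a simplified sketch of that idea, not a description of how any real search engine works. The `keyword_density` and `looks_stuffed` helpers, the sample texts, and the 0.2 threshold are assumptions chosen for illustration.

```python
import re
from collections import Counter

def keyword_density(text, top_n=3):
    # Share of all words taken up by each of the top_n most frequent words.
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return []
    counts = Counter(words)
    return [(w, c / len(words)) for w, c in counts.most_common(top_n)]

def looks_stuffed(text, threshold=0.2):
    # Flag the text when a single word exceeds the density threshold (assumed value).
    density = keyword_density(text, top_n=1)
    return bool(density) and density[0][1] > threshold

normal = "Welcome to our online store. Browse through our collection and find the best deals."
stuffed = normal + " cheap deals" * 10
print(looks_stuffed(normal), looks_stuffed(stuffed))
```

A practical check would also ignore stopwords and look at repeated phrases, not just single words.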
VIVA Questions :
1. What is spamdexing, and how does it affect search engine optimization (SEO)?
2. Can you describe some common techniques used in spamdexing?
3. What are the potential consequences of engaging in spamdexing for a website?
4. How do search engines detect and combat spamdexing?
5. What are some ethical SEO practices that can help avoid spamdexing?
PROGRAM - 5
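from google.oauth2 import service_account
from googleapiclient.discovery import build

KEY_FILE_LOCATION = 'path/to/your/service-account-file.json'
VIEW_ID = 'YOUR_VIEW_ID'

def initialize_analyticsreporting():
    credentials = service_account.Credentials.from_service_account_file(
        KEY_FILE_LOCATION,
        scopes=['https://www.googleapis.com/auth/analytics.readonly'])
    analytics = build('analyticsreporting', 'v4', credentials=credentials)
    return analytics

def get_report(analytics):
    return analytics.reports().batchGet(
        body={
            'reportRequests': [
                {
                    'viewId': VIEW_ID,
                    # Date range, metrics and dimension are assumed from the
                    # Output description below; the original request body is truncated.
                    'dateRanges': [{'startDate': '30daysAgo', 'endDate': 'today'}],
                    'metrics': [{'expression': 'ga:goalCompletionsAll'},
                                {'expression': 'ga:goalConversionRateAll'}],
                    'dimensions': [{'name': 'ga:date'}]
                }
            ]
        }
    ).execute()

def print_response(response):
    for report in response.get('reports', []):
        columnHeader = report.get('columnHeader', {})
        dimensionHeaders = columnHeader.get('dimensions', [])
        metricHeaders = columnHeader.get('metricHeader', {}).get('metricHeaderEntries', [])
        for row in report.get('data', {}).get('rows', []):
            for header, dimension in zip(dimensionHeaders, row.get('dimensions', [])):
                print(f'{header}: {dimension}')
            for values in row.get('metrics', []):
                for metricHeader, value in zip(metricHeaders, values.get('values', [])):
                    print(f'{metricHeader.get("name")}: {value}')

def main():
    analytics = initialize_analyticsreporting()
    response = get_report(analytics)
    print_response(response)

if __name__ == '__main__':
    main()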
1. Replace KEY_FILE_LOCATION with the path to your service account JSON file.
2. Replace VIEW_ID with your Google Analytics view ID.
3. Run the script: Execute the script in your Python environment.
Output :
The script will output the conversion statistics for the past 30 days, showing the number of
goal completions and conversion rates per day.
VIVA Questions :
1. What are conversion statistics in Google Analytics, and why are they important for
businesses?
2. How do you set up and track a goal in Google Analytics to monitor conversions?
3. What is the difference between macro and micro conversions, and how can both be tracked in
Google Analytics?
4. Can you explain what a conversion funnel is and how Google Analytics helps in analyzing it?
PROGRAM - 6
from google.oauth2 import service_account
from googleapiclient.discovery import build

# Replace these placeholders with your own values
KEY_FILE_LOCATION = 'path/to/your/service-account-file.json'
VIEW_ID = 'YOUR_VIEW_ID'

def initialize_analyticsreporting():
    """Initializes the analytics reporting service object."""
    credentials = service_account.Credentials.from_service_account_file(
        KEY_FILE_LOCATION, scopes=['https://www.googleapis.com/auth/analytics.readonly'])
    analytics = build('analyticsreporting', 'v4', credentials=credentials)
    return analytics

def get_report(analytics):
    """Queries the Analytics Reporting API V4."""
    return analytics.reports().batchGet(
        body={
            'reportRequests': [
                {
                    'viewId': VIEW_ID,
                    'dateRanges': [{'startDate': '30daysAgo', 'endDate': 'today'}],
                    'metrics': [{'expression': 'ga:sessions'}, {'expression': 'ga:users'}],
                    'dimensions': [
                        {'name': 'ga:country'},
                        {'name': 'ga:city'},
                        {'name': 'ga:userType'},
                        {'name': 'ga:deviceCategory'},
                        {'name': 'ga:browser'},
                        {'name': 'ga:operatingSystem'},
                        {'name': 'ga:age'},
                        {'name': 'ga:gender'}
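                    ]
                }
            ]
        }
    ).execute()

def print_response(response):
    """Parses and prints the Analytics Reporting API V4 response."""
    for report in response.get('reports', []):
        columnHeader = report.get('columnHeader', {})
        dimensionHeaders = columnHeader.get('dimensions', [])
        metricHeaders = columnHeader.get('metricHeader', {}).get('metricHeaderEntries', [])
        rows = report.get('data', {}).get('rows', [])
        # Print each row: dimension values first, then metric values
        for row in rows:
            for header, dimension in zip(dimensionHeaders, row.get('dimensions', [])):
                print(f'{header}: {dimension}')
            for values in row.get('metrics', []):
                for metricHeader, value in zip(metricHeaders, values.get('values', [])):
                    print(f'{metricHeader.get("name")}: {value}')
            print('\n')

def main():
    analytics = initialize_analyticsreporting()
    response = get_report(analytics)
    print_response(response)

if __name__ == '__main__':
    main()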
Output :
The script will output visitor profile information for the past 30 days, including the number
of sessions, users, and details about the visitors such as their country, city, user type, device
category, browser, operating system, age, and gender.
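Because the script needs live credentials, it can help to test the parsing logic offline first. The sketch below flattens a response-shaped dictionary into one record per row. The `flatten_report` helper is hypothetical, and the `sample_response` values are made up for illustration; they only mirror the field names that `print_response` reads.

```python
def flatten_report(response):
    # Turn each report row into a dict keyed by dimension and metric names.
    records = []
    for report in response.get('reports', []):
        header = report.get('columnHeader', {})
        dim_names = header.get('dimensions', [])
        metric_names = [
            m.get('name')
            for m in header.get('metricHeader', {}).get('metricHeaderEntries', [])
        ]
        for row in report.get('data', {}).get('rows', []):
            record = dict(zip(dim_names, row.get('dimensions', [])))
            for date_range_values in row.get('metrics', []):
                record.update(zip(metric_names, date_range_values.get('values', [])))
            records.append(record)
    return records

# Hand-made sample (not real Analytics data)
sample_response = {
    'reports': [{
        'columnHeader': {
            'dimensions': ['ga:country', 'ga:deviceCategory'],
            'metricHeader': {'metricHeaderEntries': [
                {'name': 'ga:sessions', 'type': 'INTEGER'},
                {'name': 'ga:users', 'type': 'INTEGER'},
            ]},
        },
        'data': {'rows': [
            {'dimensions': ['India', 'mobile'], 'metrics': [{'values': ['12', '9']}]},
            {'dimensions': ['India', 'desktop'], 'metrics': [{'values': ['5', '4']}]},
        ]},
    }]
}

for record in flatten_report(sample_response):
    print(record)
```

Metric values arrive as strings in this structure, so convert them with `int()` or `float()` before aggregating.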
VIVA Questions :
1. What is a visitor profile in Google Analytics, and why is it important for understanding
website traffic?
2. How can you use Google Analytics to segment visitors based on demographics and
interests?
3. What is the importance of using custom segments in Google Analytics, and how do you
create one?
PROGRAM - 7
Here is a Python script to authenticate and retrieve traffic sources information from Google
Analytics:
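from google.oauth2 import service_account
from googleapiclient.discovery import build

KEY_FILE_LOCATION = 'path/to/your/service-account-file.json'
VIEW_ID = 'YOUR_VIEW_ID'

def initialize_analyticsreporting():
    credentials = service_account.Credentials.from_service_account_file(
        KEY_FILE_LOCATION, scopes=['https://www.googleapis.com/auth/analytics.readonly'])
    analytics = build('analyticsreporting', 'v4', credentials=credentials)
    return analytics

def get_report(analytics):
    return analytics.reports().batchGet(
        body={
            'reportRequests': [
                {
                    'viewId': VIEW_ID,
                    # Date range and metrics are taken from the Output description
                    # below; the original request body is truncated.
                    'dateRanges': [{'startDate': '30daysAgo', 'endDate': 'today'}],
                    'metrics': [{'expression': 'ga:sessions'}, {'expression': 'ga:users'}],
                    'dimensions': [
                        {'name': 'ga:source'},
                        {'name': 'ga:medium'},
                        {'name': 'ga:campaign'}
                    ]
                }
            ]
        }
    ).execute()

def print_response(response):
    for report in response.get('reports', []):
        columnHeader = report.get('columnHeader', {})
        dimensionHeaders = columnHeader.get('dimensions', [])
        metricHeaders = columnHeader.get('metricHeader', {}).get('metricHeaderEntries', [])
        for row in report.get('data', {}).get('rows', []):
            for header, dimension in zip(dimensionHeaders, row.get('dimensions', [])):
                print(f'{header}: {dimension}')
            for values in row.get('metrics', []):
                for metricHeader, value in zip(metricHeaders, values.get('values', [])):
                    print(f'{metricHeader.get("name")}: {value}')
            print('\n')

def main():
    analytics = initialize_analyticsreporting()
    response = get_report(analytics)
    print_response(response)

if __name__ == '__main__':
    main()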
1. Replace KEY_FILE_LOCATION with the path to your service account JSON file.
2. Replace VIEW_ID with your Google Analytics view ID.
3. Run the script: Execute the script in your Python environment.
Output :
The script will output traffic sources information for the past 30 days, including the number of sessions,
users, and details about the traffic sources such as source, medium, and campaign.
VIVA Questions :
a. What are traffic sources in Google Analytics, and why are they important for
understanding website performance?
b. How do you categorize traffic sources in Google Analytics, and what are the main
categories?
c. How can you set up and track UTM parameters to measure the effectiveness of specific
traffic sources?
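For the UTM question, a small sketch may help: the standard `utm_source`, `utm_medium` and `utm_campaign` query parameters can be appended to a landing-page URL and read back with `urllib.parse`. The `add_utm` helper, the URL, and the campaign values are hypothetical.

```python
from urllib.parse import parse_qs, parse_qsl, urlencode, urlparse, urlunparse

def add_utm(url, source, medium, campaign):
    # Append UTM parameters while keeping any query parameters already present.
    parts = urlparse(url)
    query = parse_qsl(parts.query)
    query += [
        ("utm_source", source),
        ("utm_medium", medium),
        ("utm_campaign", campaign),
    ]
    return urlunparse(parts._replace(query=urlencode(query)))

tagged = add_utm("https://example.com/offers?ref=home", "newsletter", "email", "spring_sale")
print(tagged)

# Reading the parameters back, as an analytics tool would on page load
params = parse_qs(urlparse(tagged).query)
print(params["utm_source"][0], params["utm_medium"][0], params["utm_campaign"][0])
```

These three values correspond to the `ga:source`, `ga:medium` and `ga:campaign` dimensions requested in the script above.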