
CS6010 2marks QB

SNA

Uploaded by

nirmaladevi.d2
www.AUNewsBlog.net

SOCIAL NETWORK ANALYSIS

UNIT I : INTRODUCTION

PART–A (2 Marks)

1. What is the main function of the semantic web?

 The semantic web is a collaborative effort led by the W3C, used to promote common formats for data.
 It removes ambiguity from the form of data represented on the World Wide Web.
 It allows semantic content that describes the format to be included in web pages.
 It converts unstructured content on the web into a more structured form for everyday use.
 It consists of a web of data that is represented and built using the Resource Description Framework (RDF).

2. Why is the Semantic Web used in current systems?

 The Semantic Web provides a framework on which applications can be built using the available tools.
 It allows data to be shared and reused across many applications, including enterprise-level applications.
 The W3C (World Wide Web Consortium) uses development libraries for the Semantic Web standards.
 The Semantic Web describes a web of data that can be executed, directly or indirectly, on the client machine.
 It uses a disambiguation principle that does not allow the system to provide ambiguous solutions.

3. What is the purpose of the Semantic Web?

 The Semantic Web allows users to find, share and combine information, and to transfer it from one place to another very easily.
 It builds on the current web and makes it more secure and usable in the way information is shown.
 Users can carry out tasks on the Web, such as finding a folder or category, more easily.
 The Semantic Web provides instructions for machines to execute tasks, together with an interpreter that can interpret them.
 Machines can perform the tasks provided by the Semantic Web, which involve finding, combining and acting on the information present on the web.


4. Why is the Semantic Web so useful for the development of the web?

 The Semantic Web provides instructions that machines can understand, so that the system can produce a response.
 It provides an interpreter that interprets the instructions for the machine and translates them further into human-readable form with their meaning.
 It provides information about the data format, which is needed to understand semantically structured data.
 It allows users to analyze data on the web with tools, including the content, links and other transactions between people.
 It supports applications in many areas such as blogging and publishing, so that applications can be created and circulated.

5. Why is the Semantic Web regarded as an integrator?

 An integrator allows more than one data source to be integrated, using different content and information in the system.
 The Semantic Web provides an integrator that runs across different platforms for applications that need to be published on the web.
 It provides the semantics, or metadata, for the web, which can be used to represent a status model reflecting current technologies.
 It supports the integration of different fields into one technology that can be worked upon.
 It provides tools that can be supported by applications and integrated into a platform used to create applications.

6. What are the limitations of HTML?

 HTML (HyperText Markup Language) is used to create web pages.
 HTML pages are documents that can be read by the server, and are not the best fit to be read by humans.
 HTML forms depend on scripting languages, which results in complex document creation that consumes more time.
 HTML does not initialize form data properly and does not make it easy for users to enter information only once.
 HTML forms are limited in the encoding formats they allow (urlencoded or multipart forms).


7. Why is HTML used in the Semantic Web?

 HTML is a standard language for communication between the server and the client's system.
 Files on a computer can be divided into human-readable and machine-readable forms.
 Most documents are written in HTML, which handles multimedia objects well through images and forms.
 HTML is a standard output format for responding to the client's request.
 HTML provides a way to generate the web response when the client requests data from the server.

8. What is the limitation of HTML forms?

 It is hard to initialize the data of an HTML form, and the user experience is poor because the user needs to remember the form information.
 An HTML form provides a unique control for defining the data that is initially filled in.
 It uses small bits of initialization data, present in the overall document, while defining the control.
 A new form needs to be constructed to fill in the form again, as it holds no data as a backup to fill in the information.
 Application servers do not provide a template replacement facility that stores the data and saves users from filling in the form again and again.

9. What are the design flaws involved in HTML forms?

 HTML forms provide a one-step process, i.e. from client to server.
 The processing finishes there, and no further processing can be done on the forms.
 Forms involve a complicated path to traverse, and HTML fails to make the traversal easier.
 Managing HTML forms is not easy, as it requires reinterpreting the data format at every stage of the life cycle.
 HTML forms are avoided because of their poor management and the limited provisions for creation and modification.

10. What is provided by metadata tags?

 Metadata tags provide keywords that search engines use, making the website or web page search-engine friendly.
 They are a way to categorize the content of web pages for search engines, so that it can be easily found by browsers.
 Metadata tags are represented as:
<meta name="keywords" content="computing, computer, comp"/>
<meta name="description" content="Hello world"/>
<meta name="author" content="Rohit Kumar"/>
 Metadata tags provide a good description of the page and allow its content to be displayed for better performance of the web pages.
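
As a rough illustration of how a program such as a crawler might read the metadata tags above, the following sketch uses Python's standard `html.parser` module. The class name `MetaTagCollector` and the sample page are assumptions made for this example; they do not describe how any particular search engine works.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Hypothetical helper: collects name/content pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> are routed here as well.
        if tag != "meta":
            return
        attributes = dict(attrs)
        if "name" in attributes and "content" in attributes:
            self.meta[attributes["name"]] = attributes["content"]


# Sample page built from the three tags shown in the answer above.
page = """
<html><head>
<meta name="keywords" content="computing, computer, comp"/>
<meta name="description" content="Hello world"/>
<meta name="author" content="Rohit Kumar"/>
</head><body></body></html>
"""

collector = MetaTagCollector()
collector.feed(page)

# Split the comma-separated keyword list into individual keywords.
keywords = [k.strip() for k in collector.meta["keywords"].split(",")]
print(keywords)
print(collector.meta["description"])
```

The dictionary `collector.meta` ends up holding one entry per named tag, which is the kind of machine-readable summary the answer refers to.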


11. What are the activities performed using HTML?

 HTML allows web pages to be rendered and created using an editor.
 A web page can be created with easy-to-use tags, browser-compatible code and lists of items.
 Simple documentation can be created using the tools provided by HTML.
 Images can be displayed in a variety of ways, and text can be floated using the special tags defined in the HTML version.
 Pieces of information can be combined to describe items across different web pages.

12. What is the function of semantic HTML?

 Semantic HTML follows the traditional practice of marking up code according to the guidelines.
 It does not specify the layout details in which the HTML needs to be presented.
 Semantic HTML uses tags like <em>, which denotes emphasis, rather than the <i> tag, which denotes italics.
 Layout details depend on the web browser and are set in combination with Cascading Style Sheets.
 The semantics of objects, such as items and their sales and price details, are not described.

13. What is the use of Semantic Web solutions?

 Semantic Web solutions provide publishing methodologies designed for data.
 The Resource Description Framework (RDF) is used and included in Semantic Web solutions.
 The technologies used are combined to provide descriptions and replacements of web documents.
 Web ontology languages are used to describe the links between various texts and languages.
 They include a manifest of all the descriptive data stored in web-accessible databases.
 Markup related to XML is used within the documents, and the layout is rendered using it.

14. What are the functions of machine-readable descriptions?

 Machine-readable descriptions allow content managers to manage content by adding meaning to it.
 They provide structured knowledge of the system for which the content is written.
 Machines process this knowledge using reasoning and inference.
 They provide meaningful resources and results that can be used to perform information tasks automatically and more easily.
 Semantic Web solutions support information gathering for research and allow content to be written accordingly.


15. What are the examples of using a non-semantic web page?

 The semantic web is used to make a web page more meaningful by adding content or performing automated tasks.
 A non-semantic web page uses simple, easy-to-use tags to carry out its functions.
 The tags that are used:
<item>cat</item>
 This tag is an easy way to represent the information without following a pattern, as semantic web pages do.
 For the same content, a semantic web page is described like this:
<item rdf:about="http://hello.org/Cat">Cat</item>
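
The difference between the two tags can be sketched in a few lines of Python using the standard `xml.etree.ElementTree` module. The function name `resource_of` is hypothetical, the `xmlns:rdf` declaration is added here so that the fragment parses on its own, and the URI is taken to be the example identifier `http://hello.org/Cat`.

```python
import xml.etree.ElementTree as ET

# Standard RDF namespace URI; declared explicitly so the snippet is valid XML.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

plain = "<item>cat</item>"
semantic = (
    f'<item xmlns:rdf="{RDF_NS}" '
    'rdf:about="http://hello.org/Cat">Cat</item>'
)


def resource_of(xml_text):
    """Return the rdf:about identifier of an element, or None if absent."""
    element = ET.fromstring(xml_text)
    # ElementTree exposes namespaced attributes as "{namespace}localname".
    return element.get(f"{{{RDF_NS}}}about")


print(resource_of(plain))
print(resource_of(semantic))
```

For the plain tag a program only sees the text "cat", while for the semantic tag it can also recover an identifier that other data may refer to.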

16. What are the ways in which a web page can be accessed?

 A web page requires some functions that allow it to be accessed in an easy and comfortable way.
 There are three ways in which a web page can be accessed and data retrieved from it.
 The three ways are as follows:

- The URL should always point first to the data that needs to be represented or accessed.
- Accessing the URL should return the data to the client that requested it.
- The relationship between the data and the server is represented in such a way that it points to additional URLs as well.
- The other URLs refer to data residing on their own servers, through which it can be accessed.

17. What are the challenges faced by the technology?

 The challenges faced by the semantic web include the following:
 Vastness: the very large number of pages accessed by users with the existing technology.
 Any automated reasoning system has to deal with very large inputs.
 Vagueness: this arises from the queries supplied by content providers.
 If the query terms match, the knowledge can be combined to find an answer.
 Uncertainty: this covers uncertain values, whose correspondence can be expressed using different probabilities.
 Inconsistency: a major challenge, arising from logical contradictions between ontologies.
 It arises when resources are combined to answer questions raised by different theories and sources.


18. What are the different components used in the Semantic Web?

 The Semantic Web uses different formats and technologies that enable it to reach a great extent of the web.
 It provides collections of data that have relationships with each other.
 Its components are enabled by technologies that provide descriptions of concepts, terms and relationships.
 The components used in the Semantic Web are as follows:
 Resource Description Framework (RDF): used as a method to define information and general queries of the system.
 RDF Schema (RDFS): consists of the file data type format and helps in storing the data.
 Simple Knowledge Organization System (SKOS)
 SPARQL, an RDF query language
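
To make the RDF and SPARQL components more concrete, here is a minimal sketch in plain Python. RDF data is modelled as (subject, predicate, object) triples, and the hypothetical `match` function answers a single triple pattern in which `None` acts as a wildcard, loosely in the spirit of a SPARQL variable. The `ex:` names are invented for illustration, and this is not a real RDF library or a SPARQL implementation.

```python
# A tiny in-memory triple store; every entry is (subject, predicate, object).
triples = {
    ("ex:Cat", "rdf:type", "rdfs:Class"),
    ("ex:Tom", "rdf:type", "ex:Cat"),
    ("ex:Tom", "ex:chases", "ex:Jerry"),
    ("ex:Cat", "skos:prefLabel", "cat"),
}


def match(pattern, store):
    """Return the triples matching a (s, p, o) pattern; None is a wildcard."""
    return sorted(
        triple
        for triple in store
        if all(want is None or want == got for want, got in zip(pattern, triple))
    )


# "Which resources are of type ex:Cat?"
print(match((None, "rdf:type", "ex:Cat"), triples))

# "What do we know about ex:Tom?"
print(match(("ex:Tom", None, None), triples))
```

RDFS and SKOS appear here only as vocabulary prefixes (`rdfs:Class`, `skos:prefLabel`); in practice they supply the class and labelling terms that such triples use.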

19. What is the function of the Semantic Web Stack?

 The Semantic Web stack provides the architecture for the Semantic Web and deals with the relationships between its components.
 It provides the functions used in the components and the content structure.
 XML provides a syntax for the content within documents, but associates no semantics with the meaning of the content.
 XML is a major component used with the technologies and allows the process to be standardized.
 The Semantic Web stack gathers the technologies in one place, so that they can be used together to provide something easy and useful.

20. Explain the components of the semantic web in detail.

 The components used in the semantic web are as follows:
 XML Schema is used to store data, and it also defines and restricts the structure and content of the elements in use.
 RDF provides a way of expressing data models and the relationships between them.
 An RDF-based model can be represented in a variety of syntaxes that meet web standards.
 RDF Schema provides added functionality that describes the properties and classes of RDF-based resources.
 The semantic web is used in a generalized-hierarchy way that can be used with clients and properties.


21. What are the Security Design Principles used in Web Security?

 The Security Design Principles used in Web Security are as follows:
 Least Privilege: limit the resources given to a process when it starts, so that it has only what it needs.
 Defence in Depth: layer the defences of the website so that it becomes hard for someone to break in.
 Secure the Weakest Link: most attacks target the weak links, so securing them prevents breaches.
 Fail-safe Stance: if one security mechanism fails, another should be in place to support it.
 Secure by Default: security should be provided by default to protect websites from being hacked.
 Simplicity: the design of the website should be simple to use and easily customizable.
 Usability: the design of the website should be usable, so that anyone can use the website.


PART–B (16 Marks)

1. What are the limitations of the current Web? Explain the development of the semantic Web and the emergence of the Social Web.

2. Briefly explain the development of Social Network Analysis.

3. Enumerate the static properties of social networks.

4. Explain the dynamic properties of social networks.

5. Illustrate the global structure of networks with an example.

6. Discuss in detail the macro-structure of social networks.

7. Enumerate the different dimensions of social capital and their related concepts and measures.

8. Briefly explain the following:

a) Electronic discussion networks

b) Blogs and online communities

c) Web-based networks

d) Personal networks

9. Explain the statistical properties of social network analysis.

10. Discuss the business applications of Social Network Analysis.


UNIT – II : MODELLING, AGGREGATING AND KNOWLEDGE REPRESENTATION

PART–A (2 Marks)

1. What are the uses of statistics in data mining?

Statistics is used to
 estimate the complexity of a data mining problem;
 suggest which data mining techniques are most likely to be successful; and
 identify data fields that contain the most “surface information”.

2. What are the factors to be considered while selecting the sample in statistics?
The sample should be
 Large enough to be representative of the population.
 Small enough to be manageable.
 Accessible to the sampler.
 Free of bias.

3. Name some advanced database systems.

Object-oriented databases,
Object-relational databases.

4. Name some specific application-oriented databases.
 Spatial databases,
 Time-series databases,
 Text databases and multimedia databases.

5. What is meant by relational databases?
A relational database is a collection of tables, each of which is assigned a unique name. Each table
consists of a set of attributes (columns or fields) and usually stores a large set of tuples
(records or rows). Each tuple in a relational table represents an object identified by a unique key and
described by a set of attribute values.

6. What is meant by transactional databases?
A transactional database consists of a file where each record represents a transaction. A
transaction typically includes a unique transaction identity number (trans_ID) and a list of the items
making up the transaction.

7. What are spatial databases?
Spatial databases contain spatial-related information. Such databases include geographic (map)
databases, VLSI chip design databases, and medical and satellite image databases. Spatial data may be
represented in raster format, consisting of n-dimensional bit maps or pixel maps.


8. What is a temporal database?
A temporal database stores time-related data. It usually stores relational data that include
time-related attributes. These attributes may involve several timestamps, each having different semantics.

9. What are time-series databases?
A time-series database stores sequences of values that change with time, such as data collected
regarding the stock exchange.

10. Why is machine learning done?
 To understand and improve the efficiency of human learning.
 To discover new things or structures that are unknown to human beings.
 To fill in skeletal or incomplete specifications about a domain.

11. Give the components of a learning system.
 Critic
 Sensors
 Learning element
 Performance element
 Effectors
 Problem generators

12. What are the steps in the data mining process?
 Data cleaning
 Data integration
 Data selection
 Data transformation
 Data mining
 Pattern evaluation
 Knowledge representation

13. What is data cleaning?
Data cleaning means removing inconsistent data or noise and collecting the necessary information.

14. What is data mining?
Data mining is the process of extracting or mining knowledge from huge amounts of data.

15. What is meant by pattern evaluation?
Pattern evaluation is used to identify the truly interesting patterns representing knowledge, based on
some interestingness measures.


16. What is knowledge representation?
Knowledge representation techniques are used to present the mined knowledge to the user.

17. What is visualization?
Visualization is for the depiction of data and for gaining intuition about the data being observed. It assists
analysts in selecting display formats, viewer perspectives and data representation schemas.

18. What is spatial visualization?
Spatial visualization depicts actual members of the population in their feature space.

19. What is descriptive and predictive data mining?
Descriptive data mining describes the data set in a concise and summarative manner and presents
interesting general properties of the data. Predictive data mining analyzes the data in order to construct one
model or a set of models, and attempts to predict the behavior of new data sets.

20. What is data generalization?
It is a process that abstracts a large set of task-relevant data in a database from relatively low
conceptual levels to higher conceptual levels. There are two approaches to generalization:
a. Data cube approach
b. Attribute-oriented induction approach

21. What is meant by attribute-oriented induction?
This method collects the task-relevant data using a relational database query and then performs
generalization based on the examination of the relevant set of data.

22. What is bootstrap?
One interpretation of the jackknife is that the construction of pseudo-values is based on
repeatedly and systematically sampling without replacement from the data at hand. This leads to the generalized
concept of repeated sampling with replacement, called the bootstrap.
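
The contrast between the two resampling schemes can be sketched as follows. The function names, the sample values and the fixed random seed are assumptions chosen for this illustration; the statistic being resampled here is simply the mean.

```python
import random
import statistics


def jackknife_means(data):
    """Leave-one-out means: each replicate omits exactly one observation,
    i.e. systematic sampling without replacement."""
    return [statistics.mean(data[:i] + data[i + 1:]) for i in range(len(data))]


def bootstrap_means(data, replicates, seed=0):
    """Means of resamples drawn with replacement, each as large as the data."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    n = len(data)
    return [statistics.mean(rng.choices(data, k=n)) for _ in range(replicates)]


sample = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]

jk = jackknife_means(sample)
bs = bootstrap_means(sample, replicates=1000)

print(len(jk), "jackknife replicates,", len(bs), "bootstrap replicates")
# The spread of the bootstrap means estimates the standard error of the mean.
print(statistics.stdev(bs))
```

The jackknife yields exactly one replicate per observation, whereas the number of bootstrap replicates is a free choice.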

23. What is meant by the view of the statistical approach?
The statistical approach is interested in interpreting the model. It may sacrifice some performance
to be able to extract meaning from the model structure. If accuracy is acceptable, a model that
can be decomposed into revealing parts is often more useful than a 'black box' system, especially during the early
stages of the investigation and design cycle.

24. What are deterministic models?
Deterministic models take no account of random variables, but give precise, fixed,
reproducible output.


25. What is meant by systems and models?
A system is a collection of interrelated objects, and a model is a description of a system.
Models are abstract and conceptually simple.

26. What are the ways the models are explored?
All things being equal, the smallest model that explains the observations and fits the objectives
should be accepted. In reality, "smallest" means that the model should optimize a certain scoring function
(e.g. fewest nodes, most robust, fewest assumptions).

27. What is clustering?
Clustering is the process of grouping data into classes or clusters so that objects within a cluster have
high similarity to one another, but are very dissimilar to objects in other clusters.

28. What are the requirements of clustering?
 Scalability
 Ability to deal with different types of attributes
 Ability to deal with noisy data
 Minimal requirements for domain knowledge to determine input parameters
 Constraint-based clustering
 Interpretability and usability

29. State the categories of clustering methods.
 Partitioning methods
 Hierarchical methods
 Density-based methods
 Grid-based methods
 Model-based methods

30. What is linear regression?
In linear regression, data are modeled using a straight line. Linear regression is the simplest
form of regression. Bivariate linear regression models a random variable Y, called the response variable,
as a linear function of another random variable X, called the predictor variable.

Y = a + bX
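
One common way to obtain the coefficients a and b is ordinary least squares, where b is the sum of cross-deviations of X and Y divided by the sum of squared deviations of X, and a is chosen so that the line passes through the means. The sketch below assumes that method; the function name `fit_line` and the data points are made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares estimates (a, b) for the line Y = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: co-variation of X and Y divided by the variation of X.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    # Intercept: forces the line through the point (mean_x, mean_y).
    a = mean_y - b * mean_x
    return a, b


# Made-up points lying exactly on Y = 1 + 2X.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

a, b = fit_line(xs, ys)
print(a, b)

# Predict the response for a new predictor value.
print(a + b * 6)
```

Because the sample points lie exactly on a line, the fitted coefficients recover that line; with noisy data the same formulas give the best-fitting line in the least-squares sense.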

31. State the types of linear model and state their use.
Generalized linear models represent the theoretical foundation on which linear regression
can be applied to the modeling of categorical response variables. The types of generalized linear model are
Logistic regression
Poisson regression


32. Write the preprocessing steps that may be applied to the data for classification and prediction.
 Data cleaning
 Relevance analysis
 Data transformation

33. What is data classification?
It is a two-step process. In the first step, a model is built describing a predetermined set of data
classes or concepts. The model is constructed by analyzing database tuples described by
attributes. In the second step, the model is used for classification.

34. What is a “decision tree”?
It is a flowchart-like tree structure, where each internal node denotes a test on an attribute,
each branch represents an outcome of the test, and leaf nodes represent classes or class
distributions. A decision tree is a predictive model. Each branch of the tree is a classification question, and
the leaves of the tree are partitions of the dataset with their classification.
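
The structure described above can be sketched with nested dictionaries: an internal node names the attribute it tests, each branch is keyed by one outcome of the test, and a leaf is simply a class label. The tree, its attribute names and the `classify` function are hand-made for this illustration and are not learned from data.

```python
# Internal nodes are dicts; leaves are class labels (plain strings).
tree = {
    "attribute": "outlook",
    "branches": {
        "sunny": {
            "attribute": "humidity",
            "branches": {"high": "no", "normal": "yes"},
        },
        "overcast": "yes",
        "rainy": {
            "attribute": "windy",
            "branches": {"true": "no", "false": "yes"},
        },
    },
}


def classify(node, record):
    """Follow the branch matching each tested attribute until a leaf is reached."""
    while isinstance(node, dict):
        outcome = record[node["attribute"]]
        node = node["branches"][outcome]
    return node


record = {"outlook": "sunny", "humidity": "normal", "windy": "false"}
print(classify(tree, record))
```

This corresponds to the second step of classification, where an already built model is applied to a new tuple.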

35. Where are decision trees mainly used?
 Exploration of datasets and business problems
 Data preprocessing for other predictive analysis
 Exploratory analysis by statisticians

36. What is an association rule?
Association rule mining finds interesting association or correlation relationships among a large set of data
items, which is used in decision-making processes. Association rules analyze buying
patterns, that is, items that are frequently associated or purchased together.

37. What is meant by support?
Support is the ratio of the number of transactions that include all items in the antecedent and consequent
parts of the rule to the total number of transactions. Support is an association rule interestingness measure.

38. What is confidence?
Confidence is the ratio of the number of transactions that include all items in the
consequent as well as the antecedent to the number of transactions that include all items in the antecedent.
Confidence is an association rule interestingness measure.
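
Both measures follow directly from their definitions, as the short sketch below shows. The transactions and item names are invented for illustration, and the function names `support` and `confidence` are chosen for this example.

```python
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]


def support(itemset, transactions):
    """Fraction of all transactions that contain every item in the itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Support of antecedent and consequent together, divided by the
    support of the antecedent alone."""
    both = support(set(antecedent) | set(consequent), transactions)
    return both / support(antecedent, transactions)


# Rule: bread => milk
print(round(support({"bread", "milk"}, transactions), 2))
print(round(confidence({"bread"}, {"milk"}, transactions), 2))
```

In this data three of the five transactions contain both bread and milk, and four contain bread, so the rule bread => milk has support 3/5 and confidence 3/4.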

39. How are association rules mined from large databases?
Association rule mining is a two-step process:
 Find all frequent itemsets.
 Generate strong association rules from the frequent itemsets.
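
The two steps can be sketched as two small functions. This is a brute-force illustration that enumerates candidate itemsets level by level; it is not the Apriori algorithm or any other optimized miner, and the transactions, thresholds and function names are assumptions made for the example.

```python
from itertools import combinations

transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]


def frequent_itemsets(transactions, min_support):
    """Step 1: map every itemset meeting min_support to its support."""
    items = sorted(set().union(*transactions))
    n = len(transactions)
    frequent = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            count = sum(1 for t in transactions if set(candidate) <= t)
            if count / n >= min_support:
                frequent[frozenset(candidate)] = count / n
                found = True
        if not found:
            # No frequent itemset of this size, so no larger one can exist.
            break
    return frequent


def strong_rules(frequent, min_confidence):
    """Step 2: split each frequent itemset into antecedent => consequent."""
    rules = []
    for itemset, sup in frequent.items():
        for size in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), size):
                antecedent = frozenset(antecedent)
                # Every subset of a frequent itemset is itself frequent.
                conf = sup / frequent[antecedent]
                if conf >= min_confidence:
                    rules.append((set(antecedent), set(itemset - antecedent), conf))
    return rules


frequent = frequent_itemsets(transactions, min_support=0.4)
rules = strong_rules(frequent, min_confidence=0.7)
for antecedent, consequent, conf in rules:
    print(sorted(antecedent), "=>", sorted(consequent), round(conf, 2))
```

A rule counts as "strong" here when it meets both the minimum support, enforced in step 1, and the minimum confidence, enforced in step 2.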


40. What are the advantages of dimensional modeling?
 Ease of use
 High performance
 Predictable, standard framework
 Understandable
 Extensible to accommodate unexpected new data elements and new design decisions

41. What is meant by dimensional modeling?
Dimensional modeling is a logical design technique that seeks to present the data in a
standard framework that is intuitive and allows for high-performance access. It is inherently
dimensional and adheres to a discipline that uses the relational model with some important restrictions.

42. What does a dimensional model comprise?
A dimensional model is composed of one table with a multipart key, called the fact table, and a
set of smaller tables called dimension tables. Each dimension table has a single-part primary key that
corresponds exactly to one of the components of the multipart key in the fact table.
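
A very small dimensional model can be sketched with Python's built-in `sqlite3` module. The table names, column names and rows below are invented for illustration: `fact_sales` has a two-part key, and each part corresponds to the single-part primary key of one dimension table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_store   (store_id   INTEGER PRIMARY KEY, city TEXT);

-- Fact table: multipart key made of one component per dimension.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    store_id   INTEGER REFERENCES dim_store(store_id),
    units      INTEGER,
    PRIMARY KEY (product_id, store_id)
);

INSERT INTO dim_product VALUES (1, 'pen'), (2, 'book');
INSERT INTO dim_store   VALUES (10, 'Chennai'), (20, 'Madurai');
INSERT INTO fact_sales  VALUES (1, 10, 5), (1, 20, 3), (2, 10, 7);
""")

# Join the fact table to a dimension and aggregate the numeric fact.
rows = conn.execute("""
    SELECT p.name, SUM(f.units)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.name
    ORDER BY p.name
""").fetchall()
print(rows)
```

Queries against such a model typically follow this shape: constrain or group by attributes of the dimension tables, and sum the numeric facts.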

43. What is a data mart?
A data mart is a pragmatic collection of related facts, but does not have to be exhaustive or
exclusive. A data mart is both a kind of subject area and an application. A data mart is a
collection of numeric facts.

44. What are the advantages of a data-modeling tool?
 Integrates the data warehouse model with other corporate data models.
 Helps assure consistency in naming.
 Creates good documentation in a variety of useful formats.
 Provides a reasonably intuitive user interface for entering comments about objects.

45. What is the data warehouse performance issue?
The performance of a data warehouse is largely a function of the quantity and type of data stored
within a database and the query/data loading workload placed upon the system.

46. What are the types of performance issue?
 Capacity planning for the data warehouse
 Data placement techniques within a data warehouse
 Application performance techniques
 Monitoring the data warehouse

47. Why do you need a data warehouse life cycle process?
A data warehouse life cycle approach is essential because it ensures that the project pieces
are brought together in the right order and at the right time.


48. What are the steps in the life cycle approach?
 Project Planning
 Business Requirements Definition
 Data track: Dimensional Modeling, Physical Design, Data Staging Design & Development
 Technology track: Technical Architecture Design, Product Selection & Installation
 Application track: End User Application Specification, End User Application Development
 Deployment
 Maintenance & Growth

49. List the merits of a data warehouse.
 Ability to make effective decisions from the database
 Better analysis of data and decision support
 Discovery of trends and correlations that benefit the business
 Ability to handle huge amounts of data

50. What are the characteristics of a data warehouse?
 Separate
 Available
 Integrated
 Subject Oriented
 Not Dynamic
 Consistency
 Iterative Development
 Aggregation Performance

51. List some of the data warehouse tools.
 OLAP (Online Analytic Processing)
 ROLAP (Relational OLAP)
 End User Data Access tool
 Ad Hoc Query tool
 Data Transformation services
 Replication

52. What is OLAP?
OLAP is the general activity of querying and presenting text and number data from data
warehouses, as well as a specifically dimensional style of querying and presenting that is
exemplified by a number of “OLAP vendors”. The OLAP vendors' technology is nonrelational and is almost
always based on an explicit multidimensional cube of data. OLAP databases are also known as multidimensional
databases.

53. What is ROLAP?
ROLAP is a set of user interfaces and applications that give a relational database a dimensional
flavour. ROLAP stands for Relational Online Analytic Processing.


54. What is the need for an end user data access tool?
An end user data access tool is a client of the data warehouse. In a relational data warehouse,
such a client maintains a session with the presentation server, sending a stream of separate SQL requests
to the server. Eventually the end user data access tool is done with the SQL session and turns around to
present a screen of data, a report, a graph, or some other higher form of analysis to the user.
An end user data access tool can be as simple as an ad hoc query tool or as
complex as a sophisticated data mining or modeling application.

55. What is meant by an ad hoc query tool?
A specific kind of end user data access tool that invites users to form their own queries
by directly manipulating relational tables and their joins. Ad hoc query tools, as powerful as they
are, can only be effectively used and understood by about 10% of all the potential end users of a data warehouse.

56. Name some of the data mining applications.
 Data mining for biomedical and DNA data analysis
 Data mining for financial data analysis
 Data mining for the retail industry
 Data mining for the telecommunication industry

58. Differentiate “supervised” from “unsupervised”.
In data mining, during classification the class label of each training sample is provided. This type of
training is called supervised learning, i.e. the learning of the model is supervised in that it is
told to which class each training sample belongs. E.g. classification.
In unsupervised learning, the class label of each training sample is not known, and the number or set of
classes to be learned may not be known in advance. E.g. clustering.

59. Why is data quality so important in a data warehouse environment?
Data quality is important in a data warehouse environment to facilitate decision-making. In order to
support decision-making, the stored data should provide information from a historical perspective and in a
summarized manner.

60. How can data visualization help in decision-making?
Data visualization helps the analyst gain intuition about the data being observed.
Visualization applications frequently assist the analyst in selecting display formats, viewer
perspectives and data representation schemas that foster deep intuitive understanding, thus facilitating
decision-making.


61. What do you mean by high performance data mining?
Data mining refers to extracting or mining knowledge. It involves an integration of
techniques from multiple disciplines like database technology, statistics, machine learning, neural networks,
etc. When it involves techniques from high performance computing, it is referred to as high
performance data mining.

62. What are the various data mining issues?
 Knowledge mining
 User interaction
 Performance
 Diversity in data types

63. What are the various data mining functionalities?
The data mining functionalities are:
 Concept/class description
 Association analysis
 Classification and prediction
 Cluster analysis
 Outlier analysis

64. Explain the different types of data repositories on which mining can be performed.
The different types of data repositories on which mining can be performed are:
 Relational databases
 Data warehouses
 Transactional databases
 Advanced databases
 Flat files
 World Wide Web


PART–B (16 Marks)

1. Explain the architecture of a data warehouse.
2. What is data mining? Explain the steps in knowledge discovery.
3. Explain the data pre-processing techniques in detail. Explain the smoothing techniques.
4. Explain data transformation in detail.
5. Explain normalization in detail.
6. Explain data reduction.
7. Explain data discretization and concept hierarchy generation.
8. Explain statistical measures in databases.
9. Explain multilevel association rules.
10. Explain multidimensional databases briefly.
11. Explain star, snowflake and fact constellation schemas with diagrams.
12. Explain indexing with suitable examples.
13. Explain the back propagation technique.
14. Explain partitioning methods.
15. Explain the hierarchical method of classification.
16. Explain classification by decision tree induction.
17. Explain the types of data in cluster analysis.
18. Explain outlier analysis.
19. Explain mining complex types of data.
20. Briefly explain data mining applications.
21. Explain the social impacts of data mining.
22. Explain additional themes in data mining.


UNIT – III : EXTRACTION AND MINING COMMUNITIES IN WEB SOCIAL NETWORKS

PART–A (2 Marks)

1. What is a Web community?
A web community is a website (or group of websites) where specific content or links are only available to
its members. A web community may take the form of a social network service, an Internet forum, a group of
blogs, or another kind of social software web application.

2. How does a Web community differ from a community of people?
An online community is a virtual community whose members interact with each other
primarily via the Internet. For many, online communities may feel like home, consisting of a “family of invisible
friends”. An online community can act as an information system where members can
post, comment on discussions, give advice or collaborate. Commonly, people communicate through social
networking sites, chat rooms, forums, e-mail lists and discussion boards. People may also join online
communities through video games, blogs and virtual worlds.

3. How is a Web community extracted?
Web usage mining is the application of data mining techniques to discover interesting usage
patterns from Web data in order to understand and better serve the needs of Web-based applications. Usage
data captures the identity or origin of Web users along with their browsing behavior at a Web site.

4. What is meant by a virtual community?
A virtual community is a social network of individuals who interact through specific social media,
potentially crossing geographical and political boundaries in order to pursue mutual
interests or goals. Some of the most pervasive virtual communities are online communities operating under
social networking services.

5. What is the purpose of evolution metrics?
Evolution metrics measure how a particular web community changes between successive snapshots of a
web archive. Typical metrics include the growth rate, stability, novelty, disappearance rate, merge rate and
split rate of a community, which make it possible to find communities that are emerging, growing,
shrinking, merging or splitting.

6. What attributes are used to represent how many URLs the focused community obtains or loses?
HTML5 defines a <nav> menu, which is to be used to contain the primary navigation of a website, be it a
list of links or a form element such as a search box. This is a good idea, as previous to this we would contain
the navigation block inside something like <div id="navigation">.

7. Justify the statement “The Web is extremely dynamic”.
To facilitate this task we would appreciate that the largest amount of meta-data would be supplied along
with the contents, especially:

 the web site address(es). If there are several web sites, please group the contents belonging to each
one of them on a separate directory;
 the content addresses (URL). If you are providing a local copy of a site please maintain the original
file names. If you are supplying contents that you gathered from the web please provide their original URLs;
 the content dates. Supply the date when each content was published or saved. If you do not know the
exact dates, please supply approximate dates;
 the content media type (MIME). Please maintain the original file name extensions of the contents
(e.g. .gif, .html, .jpg). If possible, provide the full HTTP header for each content. It is particularly
important to provide the media type for contents dynamically generated that do not contain file name
extensions.

8. Write notes on Web Community Charts.
A Gantt chart is a type of bar chart, devised by Henry Gantt in the 1910s, that illustrates a project
schedule. Gantt charts illustrate the start and finish dates of the terminal elements and summary elements
of a project. Terminal elements and summary elements comprise the work breakdown structure of the
project.

9. What is the size distribution of communities?
Rank-size distribution is the distribution of size by rank, in decreasing order of size. For example, if a
data set consists of items of sizes 5, 100, 5, and 8, the rank-size distribution is 100, 8, 5, 5 (ranks 1
through 4). This is also known as the rank-frequency distribution.
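
The worked example above can be reproduced with a few lines of Python. This is only a minimal sketch; the function name `rank_size` is an illustrative assumption and does not come from the source material.

```python
def rank_size(sizes):
    """Return (rank, size) pairs, sizes in decreasing order, ranks starting at 1."""
    ordered = sorted(sizes, reverse=True)
    return list(enumerate(ordered, start=1))


# Community sizes taken from the example in the answer above.
print(rank_size([5, 100, 5, 8]))  # [(1, 100), (2, 8), (3, 5), (4, 5)]
```

Applied to the member counts of extracted communities, the same ranking gives the size distribution the question asks about.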

10. What is meant by community structure?
In the study of complex networks, a network is said to have community structure if the nodes of the
network can be easily grouped into (potentially overlapping) sets of nodes such that each set of nodes is
densely connected internally. In the particular case of non-overlapping community finding, this implies that
the network divides naturally into groups of nodes with dense connections internally and sparser
connections between groups.

11. Give the significance of community discovery in social network analysis.
Community detection in complex networks has attracted growing interest and is the subject of several
research efforts that aim to understand the network structure and analyze the network properties.

12. What are the uses of community discovery?
Discovering communities in a social network environment is a graph partitioning problem, which
subdivides the entire graph into smaller partitions. Graph partitioning is an NP-hard problem, due to the
complexity of splitting the set of vertices. We introduced the method of mutual accessibility to find
communities in social networking environments. Existing work presents community discovery from blog
posts. In this research, we discovered community structures from blogs which are posted by mobile devices
such as mobile phones and specialized devices like personal digital assistants (PDA).

13. Mention the advantages of hierarchical algorithms.
 No a priori information about the number of clusters is required.
 Easy to implement and gives the best result in some cases.

14. Write notes on spectral methods.
Spectral methods are a class of techniques used in applied mathematics and scientific computing to
numerically solve certain differential equations, often involving the use of the Fast Fourier Transform.
The idea is to write the solution of the differential equation as a sum of certain "basis functions" (for
example, as a Fourier series, which is a sum of sinusoids) and then to choose the coefficients in the sum in
order to satisfy the differential equation as well as possible.

15. What is Markov Clustering?
The MCL algorithm is short for the Markov Cluster Algorithm, a fast and scalable unsupervised cluster
algorithm for graphs (also known as networks) based on simulation of (stochastic) flow in graphs.

16. What is the objective of Kernighan-Lin (KL) algorithm?
The Kernighan–Lin algorithm is a heuristic algorithm for finding partitions of graphs. Its objective is to
split the vertices into two subsets of equal (or nearly equal) size while minimizing the total weight of the
edges that cross between the two subsets. It should not be confused with the Lin–Kernighan heuristic for
the traveling salesperson problem.

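17. What is meant by modularity?
In community detection, modularity is a quality function that measures how good a division of a network
into communities is. It compares the fraction of edges that fall inside the communities with the fraction
expected if edges were placed at random while preserving node degrees. High modularity indicates dense
connections within communities and sparse connections between them.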
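
In the network-analysis sense used for community detection, the modularity Q of a partition can be written as the sum over communities of L_c/m - (d_c/2m)^2, where m is the number of edges, L_c the number of edges inside community c, and d_c the total degree of its nodes. The sketch below assumes an undirected, unweighted graph without self-loops; the function and argument names are illustrative only.

```python
def modularity(edges, community_of):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    edges        -- list of (u, v) pairs, each undirected edge listed once
    community_of -- dict mapping every node to its community label
    """
    m = len(edges)
    internal = {}    # L_c: edges with both endpoints in community c
    degree_sum = {}  # d_c: sum of node degrees in community c
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())


# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
by_triangle = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
print(modularity(edges, by_triangle))
```

Splitting the graph along the bridge gives Q = 5/14, whereas putting every node into one community gives Q = 0, which is why modularity is used to compare candidate partitions.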

18. Differentiate between agglomerative and divisive clustering?
Hierarchical clustering algorithms are either top-down or bottom-up. Bottom-up algorithms treat each
document as a singleton cluster at the outset and then successively merge (or agglomerate) pairs of clusters
until all clusters have been merged into a single cluster that contains all documents. Bottom-up hierarchical
clustering is therefore called hierarchical agglomerative clustering or HAC. Top-down (divisive) clustering
requires a method for splitting a cluster.
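
The bottom-up procedure described above can be sketched as follows. This is a hedged illustration, not a reference implementation: it assumes one-dimensional points and single linkage (distance between the closest members), both chosen only to keep the example short.

```python
def agglomerative(points):
    """Single-linkage bottom-up clustering of 1-D points.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    which is exactly the information a dendrogram draws.
    """
    clusters = [[p] for p in points]  # every point starts as a singleton
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # single linkage: distance between the closest pair of members
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((sorted(clusters[i]), sorted(clusters[j]), d))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return merges


for left, right, dist in agglomerative([1, 2, 6, 8, 15]):
    print(left, "+", right, "at distance", dist)
```

A divisive method would run in the opposite direction, starting from one cluster that holds all points and repeatedly splitting it.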

19. What is a Dendrogram?
A dendrogram is a tree diagram frequently used to illustrate the arrangement of the clusters produced by
hierarchical clustering. Dendrograms are often used in computational biology to illustrate the clustering of
genes or samples, sometimes on top of heatmaps.

20. What is Girvan and Newman’s divisive algorithm?
The Girvan–Newman algorithm detects communities by progressively removing edges from the original
network. The connected components of the remaining network are the communities. Instead of trying to
construct a measure that tells us which edges are the most central to communities, the Girvan–Newman
algorithm focuses on edges that are most likely "between" communities.

21. Write short notes on multi-level graph partitioning.
The graph partition problem is defined on data represented in the form of a graph G = (V, E), with V
vertices and E edges, such that it is possible to partition G into smaller components with specific properties.
For instance, a k-way partition divides the vertex set into k smaller components. A good partition is defined
as one in which the number of edges running between separated components is small.
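
The quality criterion stated above (few edges running between components) is usually called the cut size. A minimal sketch, assuming an undirected edge list and a dictionary that assigns each vertex to one of the k parts; both names are illustrative.

```python
def cut_size(edges, part_of):
    """Number of edges whose endpoints lie in different parts of the partition."""
    return sum(1 for u, v in edges if part_of[u] != part_of[v])


# Two triangles joined by the bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
good = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}  # split along the bridge
poor = {0: 0, 1: 0, 3: 0, 2: 1, 4: 1, 5: 1}  # split through both triangles
print(cut_size(edges, good), cut_size(edges, poor))  # 1 5
```

Partitioning heuristics such as Kernighan–Lin try to lower this number while keeping the parts balanced in size.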

22. What is stochastic flow?
It is a known fact that solutions to a certain second order parabolic partial differential equation are
represented by means of a diffusion process or a stochastic flow.

23. Mention the limitations of Markov Clustering.
The Markov Cluster Algorithm (MCL) (Van Dongen, 2000) is well-recognized as an effective method of
graph clustering. It involves changing the values of a transition matrix toward either 0 or 1 at each step in a
random walk until the stochastic condition is satisfied. When the Hadamard power for each transition
probability value is divided by the sum of each column, the rescaling process yields a transition matrix for
the next stage.
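
The two alternating steps described above, expansion (matrix squaring, which simulates the random walk) and inflation (Hadamard power followed by column rescaling), can be sketched in plain Python. The inflation exponent `r = 2`, the fixed iteration count, the added self-loops and the `1e-6` threshold are assumptions chosen for illustration, not values taken from the source.

```python
def normalize_columns(m):
    """Rescale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]


def expand(m):
    """Expansion step: square the matrix (two steps of the random walk)."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, r):
    """Inflation step: Hadamard (element-wise) power r, then column rescaling."""
    return normalize_columns([[x ** r for x in row] for row in m])


def mcl(adjacency, r=2.0, iterations=50):
    """Return clusters as a set of frozensets of node indices."""
    n = len(adjacency)
    # Self-loops are added so that every column has a non-zero sum.
    m = [[adjacency[i][j] + (1 if i == j else 0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        m = inflate(expand(m), r)
    # Every row that still carries flow lists the members of one cluster.
    clusters = set()
    for row in m:
        members = frozenset(j for j, x in enumerate(row) if x > 1e-6)
        if members:
            clusters.add(members)
    return clusters
```

Because every surviving row defines a cluster on its own, each node normally ends up in exactly one cluster, which illustrates the hard-clustering limitation usually attributed to MCL.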

24. What is the purpose of Regularized MCL?
Markov clustering (MCL) has emerged as an effective algorithm for clustering biological networks, for
instance clustering protein-protein interaction (PPI) networks to identify functional modules. However, a
limitation of MCL and its variants (e.g. regularized MCL) is that it only supports hard clustering, often
leading to an impedance mismatch given that there is often a significant overlap of proteins across
functional modules.

25. What are heterogeneous social networks?
Community mining is one of the major directions in social network analysis. ... However, in reality, there
exist multiple, heterogeneous social networks, each representing a particular kind of relationship, and each
kind of relationship may play a distinct role in a particular task.

26. What is ensemble clustering?
The cluster ensemble problem is formulated as partitioning the hypergraph by cutting a minimal
number of hyperedges. They make use of hMETIS which is a hypergraph partitioning package system.

27. What is co-citation regularity?
Co-citation, like Bibliographic Coupling, is a semantic similarity measure for documents that makes use
of citation relationships. Co-citation is defined as the frequency with which two documents are cited
together by other documents.
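
The definition above translates directly into a counting procedure. In this sketch the input format (a dictionary from each citing document to the list of documents it cites) and the document labels are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citations):
    """Count, for every pair of documents, how many documents cite both.

    citations -- dict mapping each citing document to the documents it cites
    """
    counts = Counter()
    for cited in citations.values():
        # every unordered pair in one reference list is co-cited once
        for a, b in combinations(sorted(set(cited)), 2):
            counts[(a, b)] += 1
    return counts


citations = {"P1": ["A", "B"], "P2": ["A", "B", "C"], "P3": ["B", "C"]}
print(cocitation_counts(citations))
```

Here A and B are co-cited by P1 and P2, so their co-citation frequency is 2; on the Web the same counting is applied to pages that are linked together from other pages.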

28. What are the methods of inducing the graph?
In graph theory, an induced subgraph of a graph is another graph, formed from a subset of the vertices of
the graph and all of the edges connecting pairs of vertices in that subset.
The induced subgraph isomorphism problem is a form of the subgraph isomorphism problem in which the
goal is to test whether one graph can be found as an induced subgraph of another. Because it includes the
clique problem as a special case, it is NP-complete.
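
Constructing an induced subgraph, as defined above, only requires filtering the edge list. A minimal sketch, assuming the graph is given as a list of undirected edges; the function name is illustrative.

```python
def induced_subgraph(edges, nodes):
    """Keep exactly those edges whose two endpoints both lie in `nodes`."""
    keep = set(nodes)
    return [(u, v) for u, v in edges if u in keep and v in keep]


edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(induced_subgraph(edges, {1, 2, 3}))  # [(1, 2), (2, 3)]
```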

PART–B (16 Marks)

1. What is a Web Community? How will you extract the evolution of Web Community from a series of Web Archives?
2. a. Discuss the various evolution metrics.
   b. Describe the various definitions of community.
3. Describe the core methods of community discovery in social networks.
4. Write notes on:
   a. Local graph clustering
   b. Flow-Based Post-Processing for Improving Community Detection
   c. Community Discovery via Shingling
   d. Explain the quality function to evaluate the community structure.
5. Explain the Node Classification problem.
6. Discuss the various local classifiers to solve node classification problem.
7. Describe the random walk-based methods of node classification.
8. Explain adsorption method of node classification.
9. Explain how to apply node classification to large social networks.
10. Discuss the applications of community mining algorithms.

UNIT – IV : PREDICTING HUMAN BEHAVIOUR AND PRIVACY ISSUES

PART–A (2 Marks)

1. What is meant by evolution in Social Networks?
Visual representation of social networks is important to understand the network data and convey the
result of the analysis. Signed graphs can be used to illustrate good and bad relationships between
humans location-based interaction analysis, social sharing and filtering, recommender systems
development, and link prediction and entity resolution.

2. What is stream paradigm of computation?
Stream processing is a computer programming paradigm, equivalent to dataflow programming, event
stream processing, and reactive programming, that allows some applications to more easily exploit a
limited form of parallel processing. Such applications can use multiple computational units, such as the
FPUs on a GPU or field programmable gate arrays (FPGAs), without explicitly managing allocation,
synchronization, or communication among those units.

3. Give the purpose of stream mining algorithm.
A data stream is an ordered sequence of instances that in many applications of data stream mining can
be read only once or a small number of times using limited computing and storage capabilities. Examples
of data streams include computer network traffic, phone conversations, ATM transactions, web searches,
and sensor data. Data stream mining can be considered a subfield of data mining, machine learning, and
knowledge discovery.

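4. What is the use of sliding window in stream mining?
Finding frequent patterns in a continuous stream of transactions is critical for many applications such as
retail market data analysis, network monitoring, web usage mining, and stock market prediction. A sliding
window restricts the mining to the most recent transactions, so that the results reflect current behaviour
and memory use stays bounded.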
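
A sliding window over a transaction stream can be sketched with a bounded queue plus running item counts. The class name, the fixed window size measured in transactions, and the restriction to single items (rather than full itemsets) are simplifying assumptions for illustration.

```python
from collections import Counter, deque


class SlidingWindowCounter:
    """Track item frequencies over the most recent `window_size` transactions."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.window = deque()
        self.counts = Counter()

    def add(self, transaction):
        items = frozenset(transaction)
        self.window.append(items)
        self.counts.update(items)
        if len(self.window) > self.window_size:
            expired = self.window.popleft()  # oldest transaction leaves the window
            self.counts.subtract(expired)

    def frequent(self, min_support):
        """Items that occur in at least `min_support` transactions of the window."""
        return {item for item, c in self.counts.items() if c >= min_support}


w = SlidingWindowCounter(window_size=3)
for t in [{"a", "b"}, {"a", "c"}, {"a", "b"}, {"b", "c"}, {"b", "d"}]:
    w.add(t)
print(w.frequent(min_support=2))  # {'b'}
```

Item "a" was frequent in the first three transactions but drops out once those transactions expire, which is the behaviour a sliding window is meant to provide.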

5. What are the two different threads of research on the analysis of dynamic social networks?
Social and temporal analysis methods.

6. List the characteristics of perennial objects?
An object is made of tangible material (the pen is made of plastic, metal, ink).
An object holds together as a single whole (the whole pen, not a fog).
An object has properties (the color of the pen, where it is, how thick it writes...).
An object can do things and can have things done to it.

7. How will you compute the entity similarity matrix?
The term "cosine similarity" is sometimes used to refer to a different definition of similarity provided
below. However the most common use of "cosine similarity" is as defined above, and the similarity and
distance metrics defined below are referred to as "angular similarity" and "angular distance" respectively.
The normalized angle between the vectors is a formal distance metric and can be calculated from the
similarity score defined above. This angular distance metric can then be used to compute a similarity
function bounded between 0 and 1, inclusive.
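
The quantities named above can be sketched as follows, assuming each entity is described by a numeric feature vector with at least one non-zero component. The angular distance is taken here as the angle divided by pi (the form for vectors that may have negative components); this choice and all function names are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def angular_distance(a, b):
    """Normalized angle between a and b, in the range [0, 1]."""
    cos = max(-1.0, min(1.0, cosine_similarity(a, b)))  # guard against rounding
    return math.acos(cos) / math.pi


def angular_similarity(a, b):
    """Similarity bounded between 0 and 1, inclusive."""
    return 1.0 - angular_distance(a, b)


def similarity_matrix(vectors):
    """Pairwise cosine similarities between all entity vectors."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


entities = [[1, 0], [0, 1], [1, 1]]
for row in similarity_matrix(entities):
    print([round(x, 3) for x in row])
```

The resulting matrix is symmetric with ones on the diagonal; replacing `cosine_similarity` by `angular_similarity` gives a matrix whose entries are guaranteed to lie between 0 and 1.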

8. What is an Evolution Net?
A social network is a social structure made up of a set of social actors and sets of dyadic ties. The study of
these structures uses social network analysis to identify local and global patterns. The Barabási model of
network evolution is an example of a scale-free network.

9. What are the challenging issues in (dynamic) probabilistic modeling?
Bayesian Networks assume a static model of the system which does not account for failure/repair
dynamics (i.e., the system state is assumed to be static during the diagnosis process). In highly dynamic
systems, this is not the case. There is a need to expand a static Bayesian Network model into a dynamic
Bayesian Network model, in order to model situations where the node states change over time.

10. What are the two risk functions of non-parametric method?
Modelling the risk function non-parametrically, estimating it, for example, by a smoothing (thin plate)
spline is attractive as a more explorative approach. For prospective studies this amounts to smoothing
within the framework and distributional assumptions of generalized regression models (for binary
observations). Case-control studies as retrospective studies with exposure to risk factors being observed do
not immediately fit into this setting.

11. What is meant by social influence?
Social influence occurs when one's emotions, opinions, or behaviors are affected by others. Social
influence takes many forms and can be seen in conformity, socialization, peer pressure, obedience,
leadership, persuasion, sales and marketing.

12. What is meant by social correlation?
The correlation is one of the most common and most useful statistics. A correlation is a single number
that describes the degree of relationship between two variables.

13. What is meant by triadic closure?
Triadic closure is the property among three nodes A, B, and C, such that if a strong tie exists between
A-B and A-C, there is a weak or strong tie between B-C.
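
The property above can be checked mechanically by looking for triples where it fails. The sketch below assumes ties are given as undirected pairs, with strong ties listed separately from the set of all ties; the function name `open_triads` is illustrative.

```python
from itertools import combinations


def open_triads(strong_ties, all_ties):
    """Return (a, b, c) where a has strong ties to b and c but b-c has no tie.

    strong_ties -- list of undirected (u, v) pairs that are strong ties
    all_ties    -- list of all undirected ties, weak and strong
    """
    neighbours = {}
    for u, v in strong_ties:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    linked = {frozenset(t) for t in all_ties}
    missing = []
    for a, friends in neighbours.items():
        for b, c in combinations(sorted(friends), 2):
            if frozenset((b, c)) not in linked:
                missing.append((a, b, c))
    return missing


strong = [("A", "B"), ("A", "C")]
print(open_triads(strong, strong))                 # [('A', 'B', 'C')]
print(open_triads(strong, strong + [("B", "C")]))  # []
```

Each reported triple is a candidate for link prediction: triadic closure suggests that the missing B-C tie is likely to form.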

14. What is node-based centrality?
Freeman (1978) formalized three different measures of node centrality: degree, closeness, and
betweenness. Degree is the number of nodes that a focal node is connected to, and measures the
involvement of the node in the network.

15. What is meant by Katz centrality?
Katz centrality of a node is a measure of centrality in a network. It was introduced by Leo Katz in 1953
and is used to measure the relative degree of influence of an actor (or node) within a social network. Unlike
typical centrality measures which consider only the shortest path (the geodesic) between a pair of actors,
Katz centrality measures influence by taking into account the total number of walks between a pair of actors.
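
Katz centrality is commonly computed by iterating x_i = alpha * sum_j A_ij x_j + beta, which counts walks of every length and damps a walk of length k by alpha^k. The sketch below assumes an undirected graph given as an adjacency matrix; the values alpha = 0.1 and beta = 1 are illustrative, and alpha must stay below the reciprocal of the largest eigenvalue of A for the iteration to converge.

```python
def katz_centrality(adjacency, alpha=0.1, beta=1.0, iterations=100):
    """Iteratively solve x = alpha * A x + beta for an undirected graph."""
    n = len(adjacency)
    x = [0.0] * n
    for _ in range(iterations):
        x = [alpha * sum(adjacency[i][j] * x[j] for j in range(n)) + beta
             for i in range(n)]
    return x


# Star network: node 0 is the centre, nodes 1 to 4 are leaves.
star = [[0, 1, 1, 1, 1],
        [1, 0, 0, 0, 0],
        [1, 0, 0, 0, 0],
        [1, 0, 0, 0, 0],
        [1, 0, 0, 0, 0]]
print([round(v, 4) for v in katz_centrality(star)])
```

The centre of the star receives the highest score because far more walks pass through it than through any leaf.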

16. What is social action tracking?
While event tracking measures general user interactions very well, Social Analytics provides a
consistent framework for recording social interactions. This in turn provides a consistent set of reports to
compare social network interactions across multiple networks.

17. What is meant by Latent action state?
Latent functions are the unintended, unpredicted or unseen consequences that might arise as a result of
certain manifest functions that have taken place.

18. What is meant by grouping behavior?
A group can be defined as two or more interacting and interdependent individuals who come together to
achieve particular objectives. A group behavior can be stated as a course of action a group takes as a
family. For example: Strike.

19. What is meant by diffusion influence model?
A diffusion model attempts to replicate the temporal adoption of a new product as word of mouth
travels through the target population and external communications attempt to influence demand.

20. State Expert location problem.
Human expertise is more valuable than capital, means of production or intellectual property. Contrary to
expertise, all other aspects of capitalism are now relatively generic: access to capital is global, as is access to
means of production for many areas of manufacturing. Intellectual property can be similarly licensed.
Furthermore, expertise finding is also a key aspect of institutional memory, as without its experts an
institution is effectively decapitated. However, finding and “licensing” expertise, the key to the effective use
of these resources, remain much harder, starting with the very first step: finding expertise that you can trust.

PART–B (16 Marks)

1. a. Discuss the four dimensions that are associated to knowledge discovery in social networks and elaborate on their interplay in the context of evolution.
   b. Discuss the challenges of social network streams.
2. Explain how communities evolve into the learning process as smoothly evolving constellations of interacting entities.
3. Discuss the various influence related statistics.
4. Explain briefly social similarity and influence.
5. Describe influence maximization in viral marketing.
6. Describe expert location without graph constraints.
7. Explain expert location with score propagation.
8. a. Describe in detail expert score propagation.
   b. Explain probabilistic relational models.
9. Explain in detail Bayesian probabilistic models.
10. Describe feature based link prediction.

UNIT – V : VISUALIZATION AND APPLICATIONS OF SOCIAL NETWORKS

PART–A (2 Marks)

1. What is visualization of online social networks?
Visualization system for playful end-user exploration and navigation of large scale online social networks.
Our design builds upon familiar node link network layouts to contribute customized techniques for
exploring connectivity in large graph structures, supporting visual search and analysis, and
automatically identifying and visualizing community structures.

2. What is meant by taxonomy of visualization?
A new, comprehensive taxonomy of visualization techniques, drawing from the theories of Edward Tufte
and citing examples from academia, government, and the excellent NYT visualization team. This list
contains 12 steps for turning data into a compelling visualization: Visualize, Filter, Sort, Derive, Select,
Navigate, Coordinate, Organize, Record, Annotate, Share, & Guide. 'For developers, the taxonomy can
function as a checklist of elements to consider when creating new analysis tools.' The citations alone make
this an article worth bookmarking.

3. Mention the different types of visualization.
There are two basic types of visualization techniques:
 Internalizing - visualization pictures in our mind's eye.
 Externalizing - visualization pictures outside of us with our eyes open.

4. What are the two approaches to structural visualization?
A very common approach to structural visualization is to guide the visualization process by underlying
programming styles or computational models.

5. State the purpose of visualization.
Based on (non-visual) data. A visualization's purpose is the communication of data. That means that
the data must come from something that is abstract or at least not immediately visible (like the inside of the
human body).

6. What is meant by proximity of nodes?
At the detailed level (i.e., node level) of link analysis, we want to figure out the relationship between two
nodes on the graph, such as proximity, association, correlation and causality. For proximity, the goal is to
measure the closeness (a.k.a. relevance, or similarity) between two nodes.

7. What are the various layout algorithms?
 Force-based layout systems
 Spectral layout
 Orthogonal layout

8. Give the significance of graph layout algorithm?
In mathematics, graph theory is the study of graphs, which are mathematical structures used to model
pairwise relations between objects. A graph in this context is made up of vertices, nodes, or points which
are connected by edges, arcs, or lines.

9. Write short notes on node-edge diagrams.
Node-edge diagram based techniques as well as space-filling approaches incorporate the focus and
context concept.

10. Write notes on matrix-oriented techniques.
In geometry the orientation, angular position, or attitude of an object such as a line, plane or rigid body is
part of the description of how it is placed in the space it is in. Namely, it is the imaginary rotation that is
needed to move the object from a reference placement to its current placement. A rotation may not be
enough to reach the current placement. It may be necessary to add an imaginary translation, called the
object's location (or position, or linear position). The location and orientation together fully describe how
the object is placed in space.

11. Write short notes on Web Communities.
A web community is a web site (or group of web sites) where specific content or links are only available
to its members. A web community may take the form of a social network service, an Internet forum, a group
of blogs, or another kind of social software web application.

12. What are digital libraries?
A digital library is a special library with a focused collection of digital objects that can include text, visual
material, audio material, video material, stored as electronic media formats (as opposed to print, microform, or
other media), along with means for organizing, storing, and retrieving the files and media.

13. What do you mean by Content-centric visualization?
In contrast to IP-based, host-oriented Internet architecture, content centric networking (CCN) emphasizes
content by making it directly addressable and routable. Endpoints communicate based on named data instead of IP
addresses.

14. What is the purpose of User-centric visualization?
The need to understand and track files (and inherently, data) in cloud computing systems is in high
demand. Over the past years, the use of logs and data representation using graphs have become the main method
for tracking and relating information to the cloud users.

15. Define semantic visualization.
Visualization can support effective and efficient interaction with a range of information for a variety of tasks.

16. What is meant by ontology engineering?
Ontology engineering in computer science and information science is a field which studies the methods
and methodologies for building ontologies: formal representations of a set of concepts within a domain and the
relationships between those concepts. A large-scale representation of abstract concepts such as actions, time,
physical objects and beliefs would be an example of ontological engineering.

17. What is a semantic substrate?
A semantic wiki is a wiki that has an underlying model of the knowledge described in its pages. Regular,
or syntactic, wikis have structured text and untyped hyperlinks. Semantic wikis, on the other hand, provide
the ability to capture or identify information about the data within pages, and the relationships between
pages, in ways that can be queried or exported like a database through semantic queries.

18. What is meant by data visualization?
Data visualization or data visualisation is viewed by many disciplines as a modern equivalent of visual
communication. It involves the creation and study of the visual representation of data, meaning
"information that has been abstracted in some schematic form, including attributes or variables for the
units of information".

19. What is the purpose of ontology mapping?
Ontology mapping may refer to:

 Semantic integration
 Ontology alignment

20. What is meant by semantic integration?
Semantic integration is the process of interrelating information from diverse sources, for example
calendars and to do lists, email archives, presence information (physical, psychological, and social),
documents of all sorts, contacts (including social graphs), search results, and advertising and marketing
relevance derived from them. In this regard, semantics focuses on the organization of and action upon
information by acting as an intermediary between heterogeneous data sources, which may conflict not only
by structure but also context or value.

21. What is meant by ontology alignment?
Ontology alignment, or ontology matching, is the process of determining correspondences between
concepts. A set of correspondences is also called an alignment. The phrase takes on a slightly different
meaning in computer science, cognitive science or philosophy.

PART–B (16 Marks)

1. What is visualization? Explain Social Network visualization on the Web.
2. Discuss the taxonomy of visualizations of social networks.
3. Explain the following:
   a. Clustering
   b. Centrality
   c. Node-link diagrams
4. Explain the Node-edge diagrams to visualize social networks.
5. Explain how to visualize social networks with matrix-based representation. Also discuss the pros and cons of matrix-based representation.
6. Discuss the various approaches to scale node-link diagrams to large networks with several thousand or millions of nodes.
7. Briefly explain the hybrid representation of visualization.
8. Briefly explain the concept of modeling and aggregating social network data.
9. Explain how clustering is performed with random walk based measures. Also discuss the algorithms for computing proximity measures.
10. a) Discuss the applications of random walks approach.
    b) Briefly explain the use of Hadoop and MapReduce.
