DataScience ExpertRating

The document contains a series of questions and answers related to various topics in computer science, data science, and programming. Each question presents multiple-choice options, with the correct answer indicated for each. Topics include IPython functions, distributed file systems, graph properties, data visualization, and more.

Uploaded by

evisa

1. **IPython Function for Clipboard Content:**


- Which of the following options is the correct IPython function for pasting the content of the clipboard?
- [ ] a. $paste
- [ ] b. @paste
- [ ] c. %paste
- [ ] d. &paste

Final answer: c. %paste

2. **Distributed File System (DFS):**


- Which of the following statements is NOT correct about the distributed file system (DFS)?
- a. In DFS, the files are automatically replicated across multiple servers for redundancy.
- b. DFS systems scale easily.
- c. They cannot store files larger than any one computer disk.
- d. DFS can provide security to files similar to a normal file system.

Final answer: c. They cannot store files larger than any one computer disk.

3. **Phase Transition in Graph Properties:**


- In relation to phase transitions for increasing properties, every increasing property Q of G(n, p) has a phase transition at p(n), where for every n, p(n) is the minimum real number \( a_n \) for which G(n, a_n) has the property Q with a certain probability. What is that probability?
- [ ] a. 1
- [ ] b. 1/4
- [ ] c. 1/6
- [ ] d. 1/2

Final answer: d. 1/2

4. **Key-Value Stores:**
- Which of the following databases are examples of key-value stores?
- [ ] a. Redis
- [ ] b. CouchDB
- [ ] c. MongoDB
- [ ] d. Riak

Final answer: a. Redis and d. Riak

5. **Hadoop Components:**
- Which of the following Hadoop components is used for managing the cluster resources?
- [ ] a. HDFS
- [ ] b. YARN
- [ ] c. MapReduce
- [ ] d. Hive

Final answer: b. YARN

6. **Cycles in Graphs:**
- In relation to cycles and full connectivity of graphs, what is the threshold for the existence of
cycles in G(n, p)?
- ○ a. \( p = 1/2n \)
- ○ b. \( p = \sqrt{n} \)
- ○ c. \( p = 1/n \)
- ○ d. \( p = 1/\sqrt{n} \)

Final answer: c. \( p = 1/n \)
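The threshold can be explored empirically. The following is a minimal sketch, not a proof: the helper names `sample_gnp`, `has_cycle`, and `cycle_frequency` are hypothetical, and the sketch simply samples G(n, p) and checks each sample for a cycle with union-find.

```python
import random

def sample_gnp(n, p, rng):
    """Return the edge list of one G(n, p) sample: each pair is kept with probability p."""
    return [(u, v) for u in range(n) for v in range(u + 1, n) if rng.random() < p]

def has_cycle(n, edges):
    """Detect a cycle in an undirected graph on vertices 0..n-1 using union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps the trees shallow
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            # u and v were already connected, so this edge closes a cycle
            return True
        parent[ru] = rv
    return False

def cycle_frequency(n, p, trials=200, seed=0):
    """Fraction of sampled graphs that contain at least one cycle."""
    rng = random.Random(seed)
    hits = sum(has_cycle(n, sample_gnp(n, p, rng)) for _ in range(trials))
    return hits / trials
```

Comparing, for example, `cycle_frequency(200, 0.1 / 200)` with `cycle_frequency(200, 10 / 200)` should show cycles being rare well below p = 1/n and common well above it.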

7. **Markov's Inequality:**
- In relation to Markov's inequality, if \( x \) is a non-negative random variable, then which of
the following statements is correct, when \( a > 0 \)?
- [ ] a. Prob\( (x \geq a) \geq E(x)/a \)
- [ ] b. Prob\( (x \geq a) \leq E(x)/a \)
- [ ] c. Prob\( (x \leq a) \leq E(x)/a \)
- [ ] d. Prob\( (x = a) > E(x)/a \)

Final answer: b. Prob\( (x \geq a) \leq E(x)/a \)
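Markov's inequality can be checked exactly on a small example. The sketch below assumes a hypothetical distribution (a fair six-sided die) and uses exact fractions, so the comparison involves no rounding.

```python
from fractions import Fraction

# Hypothetical example distribution: a fair six-sided die (non-negative values only).
pmf = {k: Fraction(1, 6) for k in range(1, 7)}

def expectation(pmf):
    """E(x) for a finite distribution given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())

def tail_prob(pmf, a):
    """Exact Prob(x >= a)."""
    return sum(p for x, p in pmf.items() if x >= a)

def markov_bound(pmf, a):
    """The Markov upper bound E(x) / a, valid for a > 0."""
    return expectation(pmf) / a

for a in range(1, 7):
    # The exact tail probability never exceeds the bound.
    assert tail_prob(pmf, a) <= markov_bound(pmf, a)
```

For small a the bound exceeds 1 and says nothing; it only becomes informative once a is larger than E(x).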

8. **Left Singular Vectors:**


- Choose True or False.
- The left singular vectors are pairwise orthogonal.
- [ ] a. True
- [ ] b. False

Final answer: a. True

9. **Spherical Gaussian Fitting:**


- Considering the fitting of a spherical Gaussian to data, let \(x_1, x_2, \ldots, x_n\) be a set of n points in d-dimensional space. The sum \((x_1 - \mu)^2 + (x_2 - \mu)^2 + \cdots + (x_n - \mu)^2\) is minimized when \(\mu\) is the centroid of the points \(x_1, x_2, \ldots, x_n\), namely \(\mu =\) ______.
- [ ] a. \((x_1 + x_2 + \cdots + x_n)/n\)
- [ ] b. \(n(x_1 + x_2 + \cdots + x_n)\)
- [ ] c. \((x_1 + x_2 + \cdots + x_n)/2n\)
- [ ] d. \((x_1 + x_2 + \cdots + x_n)/n^2\)

Final answer: a. \((x_1 + x_2 + \cdots + x_n)/n\)
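A small numeric check illustrates the claim. This is a sketch with hypothetical sample points and helper names (`centroid`, `sum_sq_dist`); it compares the sum of squared distances at the centroid with the sum at a few shifted candidates.

```python
def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    d = len(points[0])
    return tuple(sum(p[j] for p in points) / n for j in range(d))

def sum_sq_dist(points, mu):
    """Sum over all points of the squared Euclidean distance to mu."""
    return sum(sum((pj - mj) ** 2 for pj, mj in zip(p, mu)) for p in points)

# Hypothetical sample data: three points in the plane.
points = [(0.0, 0.0), (2.0, 0.0), (1.0, 3.0)]
mu = centroid(points)
best = sum_sq_dist(points, mu)

# Moving away from the centroid in any of these directions increases the sum.
for shift in [(0.5, 0.0), (0.0, -0.5), (0.3, 0.3)]:
    candidate = tuple(m + s for m, s in zip(mu, shift))
    assert sum_sq_dist(points, candidate) > best
```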


10. **Data Visualization Functions:**
- In relation to data visualization, which of the following functions is used for describing what happens on adding an extra observation?
- [ ] a. reduceSum()
- [ ] b. reduceAdd()
- [ ] c. reduceRemove()
- [ ] d. reduceInit()

Final answer: b. reduceAdd()

11. **Data Visualization Interactivity:**


- In relation to data visualization, which of the following options is used for handling the
interactivity?
- [ ] a. JQuery
- [ ] b. d3.js
- [ ] c. Bootstrap
- [ ] d. HTML5

Final answer: b. d3.js

12. **Graph Databases - Edges:**


- In relation to Graph databases, which of the following statements is NOT correct about an
edge?
- [ ] a. It represents a relationship between two entities.
- [ ] b. It has its own properties.
- [ ] c. It can never have a direction.
- [ ] d. It is represented using a line.

Final answer: c. It can never have a direction.

13. **Data Visualization - reduceInit() Function:**


- In relation to the data visualization reduceInit() function, what is the most logical starting point for a sum and a count?
- [ ] a. -1
- [ ] b. 0
- [ ] c. 1
- [ ] d. 2

Final answer: b. 0
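The reason 0 is the natural starting point shows up when the reducer pattern is written out. The sketch below is a Python analogue with hypothetical function names (`reduce_init`, `reduce_add`, `reduce_remove`); it is not the Crossfilter API itself, only the same init/add/remove idea.

```python
def reduce_init():
    """Starting state: nothing has been observed yet, so count and total are both 0."""
    return {"count": 0, "total": 0}

def reduce_add(acc, value):
    """Update the state when an observation enters the group."""
    acc["count"] += 1
    acc["total"] += value
    return acc

def reduce_remove(acc, value):
    """Undo the update when an observation leaves the group (for example, after filtering)."""
    acc["count"] -= 1
    acc["total"] -= value
    return acc

acc = reduce_init()
for v in (4, 6):
    acc = reduce_add(acc, v)
acc = reduce_remove(acc, 4)
```

Starting from any value other than 0 would leave a permanent offset in both the sum and the count.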

14. **Data Science Clustering:**


- Fill in the blank with the most suitable option.
- In relation to data science clustering, if there is a k-clustering of radius r/2, the farthest traversal k-clustering algorithm finds a k-clustering with radius at most ______.
- ○ a. r/4
- ○ b. 2r
- ○ c. r²
- ○ d. r

Final answer: d. r
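The farthest traversal idea can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function name `farthest_traversal` is hypothetical, the first point is used as the initial center, and distances are Euclidean. The resulting radius is known to be at most twice the best achievable radius.

```python
import math

def farthest_traversal(points, k):
    """Pick k centers greedily and return (centers, radius).

    Starts from the first point, then repeatedly adds the point that is
    farthest from all centers chosen so far.
    """
    centers = [points[0]]
    # dist[i] is the distance from points[i] to its nearest chosen center
    dist = [math.dist(p, centers[0]) for p in points]
    while len(centers) < k:
        i = max(range(len(points)), key=dist.__getitem__)
        centers.append(points[i])
        dist = [min(d, math.dist(p, points[i])) for d, p in zip(dist, points)]
    return centers, max(dist)

# Hypothetical sample data: two well-separated pairs on a line.
points = [(0, 0), (1, 0), (10, 0), (11, 0)]
centers, radius = farthest_traversal(points, 2)
```

Here the best possible radius with arbitrary centers is 0.5 (centers at the midpoint of each pair), and the greedy result stays within twice that value.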

15. **Law of Higher Moments:**


- In relation to the law of higher moments, if \( r \) is a positive even integer, then which of the following statements is correct?
- [ ] a. Prob \((|x| \geq a) \leq E(x^r)/a^r\)
- [ ] b. Prob \((|x| \geq a) \leq E(x^2)/a^{2r}\)
- [ ] c. Prob \((|x| \geq a \geq E(a)) \geq E(x)/a^r\)
- [ ] d. Prob \((|x| \geq a) = E(x^r)/a^2\)

Final answer: a. Prob \((|x| \geq a) \leq E(x^r)/a^r\)
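The bound can be verified exactly on a small symmetric distribution. The sketch below assumes a hypothetical random variable that is uniform on {-2, -1, 0, 1, 2}; the helper names are illustrative only.

```python
from fractions import Fraction

# Hypothetical example: x is uniform on {-2, -1, 0, 1, 2}.
pmf = {x: Fraction(1, 5) for x in range(-2, 3)}

def abs_tail_prob(pmf, a):
    """Exact Prob(|x| >= a)."""
    return sum(p for x, p in pmf.items() if abs(x) >= a)

def moment_bound(pmf, a, r):
    """E(x^r) / a^r for a positive even integer r and a > 0."""
    if r <= 0 or r % 2:
        raise ValueError("r must be a positive even integer")
    moment = sum(p * x ** r for x, p in pmf.items())
    return moment / Fraction(a) ** r

for r in (2, 4):
    for a in (1, 2, 3):
        # The exact tail probability never exceeds the r-th moment bound.
        assert abs_tail_prob(pmf, a) <= moment_bound(pmf, a, r)
```

With r = 2 this is Chebyshev-style reasoning applied to x itself; larger even r can give a tighter bound for large a.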

16. **Text Mining - Lemmatization:**


- In relation to the text mining technique **lemmatization**, which of the following POS (part-of-speech) tags denotes a comparative adjective?
- ○ a. CC
- ○ b. CD
- ○ c. JJR
- ○ d. NNS

Final answer: c. JJR

17. **R Commands - Installing quantmod Package:**


- Which of the following R commands is used for installing the quantmod package?
- [ ] a. package.install("quantmod")
- [ ] b. install.new.packages('quantmod')
- [ ] c. install.packages("quantmod")
- [ ] d. new.package.install('quantmod')

Final answer: c. install.packages("quantmod")

18. **Data Science Modeling - ARCH Models:**


- In relation to data science modeling using R, which of the following statements is correct about the ARCH models?
- [ ] a. It states that volatility for a time period t is auto-correlated with volatility from period (t + 1), or more succeeding periods.
- [ ] b. It states that volatility for a time period t is auto-correlated with volatility from period (\( t^2 \)), or more preceding periods.
- [ ] c. It states that volatility for a time period t is auto-correlated with volatility from period (t - 1), or more preceding periods.
- [ ] d. It states that volatility for a time period t is auto-correlated with volatility from period (\( t^2 \)), or more succeeding periods.

Final answer: c. It states that volatility for a time period t is auto-correlated with volatility from period (t - 1), or more preceding periods.

19. **Hadoop Commands - Directory Access:**


- Suppose you created a new directory named "Example1" in Hadoop. Which of the following
Hadoop commands is used for giving everyone access to this directory?
- [ ] a. sudo -u hdfs hadoop fs -chmod 777 /Example1
- [ ] b. sudo -u hadoop hdfs fs -chmod 777 /Example1
- [ ] c. sudo -a hadoop hdfs -chmod 777 /Example1
- [ ] d. sudo -a hadoop hdfs -chmod 777/fs /Example1

Final answer: a. sudo -u hdfs hadoop fs -chmod 777 /Example1

20. **Vagrantfile - Connection Timeout:**


- When the message default: Warning: Connection time out. Retrying... is printed repeatedly, which of the following code snippets should be added to the Vagrantfile before the last end statement in order to find out what is wrong?
- a. config_new.provider "vm" do |vb|
  vb.gui = true
  end
- b. config_new.vm.provider "virtualbox" do |vb|
  vb.gui = false
  end
- c. config.vm.provider "virtualbox" do |vb|
  vb.gui = true
  end
- d. config.virtualbox.provider "virtualbox" do |vm|
  vm.gui = false
  end

Final answer: c. config.vm.provider "virtualbox" do |vb|
vb.gui = true
end

21. **Singular Value Decomposition - Directed Graph:**


- Fill in the blank with the correct option.
- In relation to singular value decomposition, consider a directed graph G(V, E). For this
graph, a cut of size at least the maximum cut minus ______ can be computed in time
polynomial in n for any fixed k.
- ○ **a.** \(O(\sqrt{n}/k)\)
- ○ **b.** \(O(n^2/k)\)
- ○ **c.** \(O(\sqrt{n}/k^2)\)
- ○ **d.** \(O(2n^2/k)\)

Final answer: b. \(O(n^2/k)\)

22. **Python Function Definition:**


- Which of the following options is used in Python for defining functions?
- [ ] a. define
- [ ] b. def
- [ ] c. void
- [ ] d. func

Final answer: b. def

23. **Operations Optimization - Large Data Structures:**


- In relation to operations optimization, which of the following options is used for providing
data structures that can be larger than the computer's main memory, thus enabling the user to
work with the larger data sets?
- [ ] a. PP
- [ ] b. Dispy
- [ ] c. PyCUDA
- [ ] d. Blaze

Final answer: d. Blaze

24. **Data Visualization - MapReduce Library:**


- In relation to data visualization, which of the following options is a MapReduce library and a
prerequisite to dc.js?
- ○ **a.** Crossfilter.js
- ○ **b.** d3.js
- ○ **c.** JQuery
- ○ **d.** Bootstrap

Final answer: a. Crossfilter.js

25. **Python List Concatenation:**


- Which of the following options is the correct Python syntax for concatenating values 8, 9,
and 10 to an already existing list named a with the values 5, 6, and 7?
- [ ] a. a = [5, 6, 7]; a + ([8, 9, 10])
- [ ] b. a = [5, 6, 7]; a.add([8, 9, 10])
- [ ] c. a = [5, 6, 7]; a.extend([8, 9, 10])
- [ ] d. a = [5, 6, 7]; a++([8, 9, 10])

Final answer: c. a = [5, 6, 7]; a.extend([8, 9, 10])
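The difference between `extend` and the `+` operator is easy to miss, so here is a short illustration. The variable names `combined`, `snapshot`, and `result` are introduced only for this sketch.

```python
a = [5, 6, 7]

# The + operator builds a new list and leaves a untouched.
combined = a + [8, 9, 10]
snapshot = list(a)  # copy of a taken before extend() is called

# extend() appends the values to a in place and returns None.
result = a.extend([8, 9, 10])
```

After this runs, `a` and `combined` hold the same six values, while `snapshot` still holds the original three.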

26. **Hadoop File System Command - Moving a File:**


- Which of the following options is the correct syntax of the Hadoop file system command
used for moving a file?
- [ ] a. hadoop fs -mv -m OLDURI NEWURI
- [ ] b. hadoop fs hdfs -mv OLDURI NEWURI
- [ ] c. hadoop fs -mv OLDURI NEWURI
- [ ] d. hadoop fs -m OLDURI to NEWURI

Final answer: c. hadoop fs -mv OLDURI NEWURI

27. **Maximum Likelihood Estimate (MLE):**


- Select the correct statements about the maximum likelihood estimate (MLE).
1. The MLE may not always exist.
2. If the MLE exists, it may not be unique.
3. The MLE always exists.
4. If the MLE exists, it must be unique.
- [ ] a. 1 and 4
- [ ] b. 2 and 3
- [ ] c. 1 and 3
- [ ] d. 2 and 1

Final answer: d. 2 and 1

28. **Spark Components - Real-Time Analysis:**


- Which of the following Spark components is used for real-time analysis?
- a. MLlib
- b. GraphX
- c. Spark SQL
- d. Spark Streaming

Final answer: d. Spark Streaming

29. **Preliminaries - Projection of a Point:**


- In relation to preliminaries, if a point \( a_i = (a_{i1}, a_{i2}, \ldots, a_{id}) \) is projected onto a line through the origin, then what is the value of \( a_{i1}^2 + a_{i2}^2 + \cdots + a_{id}^2 \)?
- [ ] a. (length of projection)\(^2\) \(\times\) (distance of point to line)\(^2\)
- [ ] b. (length of projection)\(^2\) / (distance of point to line)\(^2\)
- [ ] c. (length of projection)\(^2\) + (distance of point to line)\(^2\)
- [ ] d. 2 (length of projection)\(^2\) \(\times\) (distance of point to line)\(^2\)

Final answer: c. (length of projection)\(^2\) + (distance of point to line)\(^2\)
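This is the Pythagorean theorem applied to the point, its projection, and the origin. The sketch below checks it numerically; the function name `project_onto_line` and the sample vectors are hypothetical.

```python
import math

def project_onto_line(a, v):
    """Return (signed projection length, distance to the line) for point a.

    The line passes through the origin in direction v (v need not be a unit vector).
    """
    norm = math.sqrt(sum(c * c for c in v))
    u = [c / norm for c in v]                       # unit direction of the line
    length = sum(ai * ui for ai, ui in zip(a, u))   # dot product a . u
    residual = [ai - length * ui for ai, ui in zip(a, u)]
    distance = math.sqrt(sum(r * r for r in residual))
    return length, distance

a = (3.0, 4.0, 12.0)
squared_norm = sum(c * c for c in a)  # a_1^2 + a_2^2 + a_3^2 = 169

for direction in [(1.0, 0.0, 0.0), (1.0, 2.0, 2.0)]:
    length, distance = project_onto_line(a, direction)
    # Squared projection length plus squared distance recovers the squared norm.
    assert math.isclose(length ** 2 + distance ** 2, squared_norm)
```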

30. **Data Science vs. Machine Learning:**


- Which of the following statements is/are the correct differences between data science and
machine learning?
- [ ] a. Data science is multi-disciplinary while machine learning is only concerned with
training machines.
- [ ] b. Data science is tightly integrated while machine learning is loosely integrated.
- [ ] c. Data science can take on a business role while machine learning is purely technical in
nature.
- [ ] d. Both options a and b.
- [ ] e. Both options a and c.

Final answer: e. Both options a and c.

31. **Python - Statement Continuation:**


- In Python, which of the following options is used for indicating that a statement continues
onto the next line?
- ○ **a.**
- ○ **b.**
- ○ **c.**
- ○ **d.**

Final answer: c. \
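A short illustration of the backslash continuation, alongside the implicit continuation that Python allows inside brackets. The variable names are chosen only for this sketch.

```python
# A trailing backslash joins this line with the next one.
total = 1 + 2 + \
    3 + 4

# Inside parentheses, brackets, or braces the continuation is implicit,
# so no backslash is needed.
same_total = (1 + 2 +
              3 + 4)
```

Both forms produce the same value; the bracketed form is often preferred because a stray space after a backslash is a syntax error.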

32. **Spark - Correct Statements:**


- Which of the following statements is correct about Spark?
- [ ] a. Spark can handle storage of files on the distributed file system.
- [ ] b. Spark cannot handle resource management.
- [ ] c. Spark can be run on the user's local system for testing and development.
- [ ] d. Spark cannot support Python.

Final answer: c. Spark can be run on the user's local system for testing and development.

33. **Data Science Clustering - Matrix Entries:**


- In relation to data science clustering, if \( A \) is an \( n \times d \) matrix with entries between 0 and 1, then \(\sigma_1(A) \geq d(A) \geq\) ______.
- a. \(\dfrac{\sigma_1(A)}{6 \log n^2 \log d}\)
- b. \(\dfrac{\sigma_1(A)}{4 \log n \log d}\)
- c. \(\dfrac{\sigma_1(A)}{4 \log n \log d^2}\)
- d. \(\dfrac{\sigma_1(A)}{2 \log n \log d}\)

Final answer: b. \(\dfrac{\sigma_1(A)}{4 \log n \log d}\)

34. **Data Science Toolbox - Shutting Down:**


- Which of the following options is the correct command that is executed for shutting down the
data science toolbox?
- [ ] a. $ vagrant destroy
- [ ] b. $ vagrant finish
- [ ] c. $ vagrant halt
- [ ] d. $ vagrant end

Final answer: c. $ vagrant halt

35. **Heteroskedasticity:**
- Select the correct statement about heteroskedasticity.
- [ ] a. Linear regression whose error terms have varying variance
- [ ] b. Linear regression whose error terms have constant variance
- [ ] c. Linear regression having zero error terms
- [ ] d. None of these

Final answer: a. Linear regression whose error terms have varying variance

36. **Python Packages - Text Mining:**


- Which of the following Python packages/tools is used for text mining?
- [ ] a. SQLite3
- [ ] b. Matplotlib
- [ ] c. PRAW
- [ ] d. NLTK

Final answer: d. NLTK

37. **Data Science - Text Analytics:**


- In relation to data science, which of the following languages is considered more suitable for
text analytics?
- ○ a. C
- ○ b. Java
- ○ c. R
- ○ d. Python

Final answer: d. Python

38. **Python Tools - Array and Linear Algebra Functions:**


- In relation to Python tools, which of the following packages is used for providing access to
array functions and linear algebra functions?
- [ ] a. SciPy
- [ ] b. NumPy
- [ ] c. NumbaPro
- [ ] d. RPy2

Final answer: b. NumPy

39. **Random Graph - Vertex Degree Probability:**


- In the given image, if \( v \) is a vertex of the random graph \( G(n, p) \) and \( a \) is a real number in \( (0, \sqrt{np}) \), then which of the following options is correct?
- ○ **a.** Option i)
- ○ **b.** Option ii)
- ○ **c.** Option iii)
- ○ **d.** Option iv)

Final answer: a. Option i)

40. **Python - Random Sampling with Replacement:**


- In Python, calling which of the following methods will result in randomly choosing a sample
of elements with replacement?
- [ ] a. random.choice
- [ ] b. random.sample
- [ ] c. Either option a or b.
- [ ] d. Neither option a nor b.

Final answer: a. random.choice
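Since `random.choice` returns a single element, sampling with replacement means calling it repeatedly. The sketch below shows that, with a hypothetical helper `sample_with_replacement`, next to `random.sample`, which never repeats a position.

```python
import random

def sample_with_replacement(population, k, rng=random):
    """Draw k elements by calling choice() k times, so duplicates are allowed."""
    return [rng.choice(population) for _ in range(k)]

rng = random.Random(42)  # seeded generator so the run is repeatable
population = [1, 2, 3]

with_replacement = sample_with_replacement(population, 10, rng)

# sample() draws without replacement, so k cannot exceed len(population).
without_replacement = rng.sample(population, 3)
```

The standard library also offers `random.choices(population, k=10)`, which draws with replacement in a single call.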
