
DATA M EXAMS Programation 2

The assignment requires students to write code for two problems without using additional packages or modifying provided APIs. Problem 1 involves calculating statistical attributes for a specified column in a dataset, while Problem 2 focuses on generating a scatter plot of two specific columns. Students must adhere to given data paths and utilize provided unit tests for validation.

Uploaded by

walidboutaghou39

Assignment Instructions

Remember that you are encouraged to discuss the problems with your instructors and classmates, but you must write
all code and solutions on your own.
The rules to be followed for the assignment are:
• Do NOT load additional packages beyond what we’ve shared in the cells below.
• Some problems with code may be autograded. If we provide a function or class API, do not change it.
• Do not change the location of the data or data directory. Use only relative paths to access the data.

import argparse
import pandas as pd
import numpy as np
import pickle
from pathlib import Path
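
As a small illustration of the relative-path rule, the sketch below builds the dataset location with Path from the imports above. The variable name data_path is only illustrative, and the path is the sample location mentioned in Problem 1.

```python
from pathlib import Path

# Build the location relative to the notebook's working directory,
# rather than hard-coding an absolute path.
data_path = Path("data") / "dataset.csv"

print(data_path.as_posix())      # forward-slash form of the relative path
print(data_path.is_absolute())   # a relative path is not absolute
```

Because the path is relative, it resolves against whatever directory the notebook runs in, which is why the data directory must stay where it is.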

Problem 1 - [10 points]


The function below should return the following attributes for the ith column:
• Number of objects
• The minimum value
• The maximum value
• The mean value
• The standard deviation value
• The Q1 value
• The median value
• The Q3 value
• The IQR value
Note:
• A sample dataset to test your code has been provided in the location "data/dataset.csv". Please leave it in place, as it is needed for grading.
• Do not change the variable names of the returned values.
• After calculating each of those values, assign it to the corresponding variable that is being returned.
• The col_num argument (the ith attribute) can range from 0 to 10.

def calculate(dataFile, col_num):
    """
    Input Parameters:
        dataFile: The dataset file.
        col_num: The ith attribute (column index) for which the various properties must be calculated.

    Default values of 0, infinity, -infinity are assigned to all the variables as required.
    """
    numObj, minValue, maxValue, mean, stdev, Q1, median, Q3, IQR = [0, "inf", "-inf", 0, 0, 0, 0, 0, 0]
    # YOUR TASK: Write code to assign the values to the respective variables.

    # your code here

    return numObj, minValue, maxValue, mean, stdev, Q1, median, Q3, IQR

This cell has hidden test cases that will run after you submit your assignment.
You can troubleshoot using the unit tests we shared below.
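
The nine attributes listed for Problem 1 map fairly directly onto pandas Series methods. The sketch below shows one possible way to compute them on a small made-up table. It is not the graded solution, and it rests on assumptions the assignment does not state: the helper name column_summary is hypothetical, "number of objects" is taken to mean the row count, the standard deviation uses pandas' default sample formula (ddof=1), and the quartiles use pandas' default linear interpolation. The hidden tests may expect different conventions.

```python
import pandas as pd

def column_summary(df, col_num):
    """Hypothetical helper: summary statistics for the column at position col_num."""
    col = df.iloc[:, col_num]        # select the column by position
    q1 = col.quantile(0.25)          # first quartile (linear interpolation by default)
    q3 = col.quantile(0.75)          # third quartile
    return (
        len(col),                    # number of objects (rows); an assumption
        col.min(),
        col.max(),
        col.mean(),
        col.std(),                   # sample standard deviation (ddof=1); an assumption
        q1,
        col.median(),
        q3,
        q3 - q1,                     # interquartile range
    )

# Made-up data for illustration only; not the assignment's dataset.
demo = pd.DataFrame({"a": [1, 2, 3, 4, 5], "b": [10, 20, 30, 40, 50]})
summary = column_summary(demo, 0)
print(summary)
```

If the expected values were produced with the population standard deviation instead, col.std(ddof=0) would be the corresponding call.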

Unit Tests:

import unittest

class TestKnn(unittest.TestCase):

    def setUp(self):
        self.loc = "data/dataset.csv"
        # Load the pickled test fixture: (column index, expected values)
        with open('data/testing', 'rb') as file:
            self.data = pickle.load(file)

    def test0(self):
        """
        Test the label counter
        """
        self.Column = self.data[0]
        result = calculate(self.loc, self.Column)
        self.assertEqual(result[0], self.data[1][0])
        # assertAlmostEqual (unlike assertEqual) accepts the places tolerance
        self.assertAlmostEqual(result[1], self.data[1][1], places=3)
        self.assertAlmostEqual(result[2], self.data[1][2], places=3)
        self.assertAlmostEqual(result[3], self.data[1][3], places=3)
        self.assertAlmostEqual(result[4], self.data[1][4], places=3)
        self.assertAlmostEqual(result[5], self.data[1][5], places=3)
        self.assertAlmostEqual(result[6], self.data[1][6], places=3)
        self.assertAlmostEqual(result[7], self.data[1][7], places=3)
        self.assertAlmostEqual(result[8], self.data[1][8], places=3)

# Collect the tests defined on the class and run them
tests_to_run = unittest.TestLoader().loadTestsFromTestCase(TestKnn)
unittest.TextTestRunner().run(tests_to_run)
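
The places = 3 argument in the tests above is the tolerance accepted by unittest's assertAlmostEqual, which passes when the difference between the two numbers rounds to zero at that many decimal places. The sketch below shows this with made-up numbers. The class name PlacesDemo and the values are illustrative only and are not taken from the assignment's fixture.

```python
import unittest

class PlacesDemo(unittest.TestCase):
    """Illustrative only: how a places=3 comparison behaves."""

    def test_within_tolerance(self):
        # Difference is about 0.00025, which rounds to 0.000 at 3 places
        self.assertAlmostEqual(1.58114, 1.58139, places=3)

    def test_outside_tolerance(self):
        # Difference is about 0.002, which does not round to zero at 3 places
        with self.assertRaises(AssertionError):
            self.assertAlmostEqual(1.581, 1.583, places=3)

suite = unittest.TestLoader().loadTestsFromTestCase(PlacesDemo)
result = unittest.TextTestRunner().run(suite)
```

In practice this means small floating-point differences in the computed statistics should not cause a failure, but a different formula (for example another quartile convention) probably will.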

Problem 2 - [10 Points]


The helper function func() below will be used to help generate a scatter plot of the columns CO on the x-axis and AFDP
on the y-axis and should return the following:
• Return the x values, y values, title, x-label and the y-label
Notes:
• The dataset is available in "./data/dataset.csv"

• Hidden tests are run against x and y values. These should be array-like objects (list or series).

# Scatter plot of columns with attributes CO on x-axis and AFDP on y-axis
# Return the x values, y values, title, x-label and the y-label
# The dataset is available in "./data/dataset.csv"
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

def func():
    '''
    Output: x, y, title, x-label, y-label
    '''
    x = []
    y = []
    title = ''
    x_label = ''
    y_label = ''

    # your code here

    return x, y, title, x_label, y_label

The cell below can be used to test our scatter plot. You don't need to modify this. Simply execute

Once the func() function above has been completed we can test it by running the cell below. Running


# Testing the func() function
x, y, title, x_label, y_label = func()
plt.scatter(x, y)
plt.title(title)
plt.xlabel(x_label)
plt.ylabel(y_label)

This cell has hidden test cases that will run after you submit your assignment.
You can troubleshoot by calling the function and checking return types and values.
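
To check return types and values as suggested, it can help to see the shape of what Problem 2 asks for. The sketch below works on a small made-up table and is not the graded solution. It assumes the CSV headers are literally "CO" and "AFDP", and the helper name scatter_inputs, the title text, and the label strings are illustrative choices that the assignment does not prescribe.

```python
import pandas as pd

def scatter_inputs(df):
    """Hypothetical helper: pick the two columns and describe the plot."""
    x = df["CO"]             # x-axis values as a pandas Series (array-like)
    y = df["AFDP"]           # y-axis values as a pandas Series (array-like)
    title = "AFDP vs. CO"    # illustrative title; wording is an assumption
    x_label = "CO"
    y_label = "AFDP"
    return x, y, title, x_label, y_label

# Made-up values for illustration only; not the assignment's dataset.
demo = pd.DataFrame({"CO": [0.5, 1.2, 3.1], "AFDP": [7.0, 8.5, 10.2]})
x, y, title, x_label, y_label = scatter_inputs(demo)

print(type(x).__name__, len(x), len(y))
```

Since the hidden tests run against x and y, confirming that both are list-like and of equal length is a reasonable first check.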
