CW2 REF Specification CSI-4-DSA 2223
Read this coursework specification carefully. It tells you how you are going to be assessed,
how to submit your coursework on time and how (and when) you’ll receive your marks and
feedback.
Coursework Aim:
Complete a programming task involving the use of several of the data structures that
have been explored in this module (see Assignment Task, below).
Coursework Details:
Type: Report
Word Count: No formal word count is required but for general
guidance no more than 2000 words would be
expected. The word count will always be ambiguous
because the requirement for embedded code will
skew it. There are no specific penalties for exceeding
or failing to reach the word count guidance, but your
submission must properly address the task in hand.
Learning Outcomes
This coursework will fully or partially assess the following learning outcomes for this module.
Describe different algorithms and the means used to measure their performance.
Analyse programming problems and identify appropriate algorithmic solutions.
Develop software to solve relatively complex problems and assess alternative
solutions.
Apply critical and analytic reasoning to tasks.
Assignment Task
You are provided with a Java application that loads data from a sequence of generated data
files and processes their contents. The data consists of random coordinates, each made up
of two integer values (for example, "25634:65432"), and the program needs to eliminate
duplicate pairs from the incoming set (that is, if "25634:65432" appears more than once in
the data, the created set should contain it only once). The provided version is incomplete
and you need to add some code to make it work.
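As a rough illustration of the textual form quoted above, the sketch below splits a string such as "25634:65432" into its two integer parts. The class and method names are hypothetical and are not part of the provided files; the provided code already handles reading the data, so you do not need anything like this to complete the assignment.

```java
// Hypothetical helper for illustration only; not part of the provided files.
final class CoordinateTextSketch {

    /** Splits text such as "25634:65432" into its two integer parts. */
    static int[] parse(String text) {
        String[] parts = text.trim().split(":");
        if (parts.length != 2) {
            throw new IllegalArgumentException("Expected x:y but got: " + text);
        }
        return new int[] { Integer.parseInt(parts[0]), Integer.parseInt(parts[1]) };
    }

    public static void main(String[] args) {
        int[] xy = parse("25634:65432");
        System.out.println("first = " + xy[0] + ", second = " + xy[1]);
    }
}
```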
Provided files:
StudentWork.java - this is the main program that you can run. You should not modify the
contents of the main method in this class (except as mentioned below, to temporarily turn off
some of the functionality), but the other methods in this class are where you need to make
your changes.
DataGenerator.java - this class does not need to be changed to complete the assignment.
It is a class that generates the files that are analysed.
charts.xlsx - this is an Excel spreadsheet with a space into which you can paste the output
from the program. That space is linked to two charts that will display the results you
obtained. You can use this to help you in your discussion of the performance characteristics
of the program, to contrast theory and the observed behaviour. The charts in this
spreadsheet are preconfigured with suitable trendlines and do not need any adjustments to use.
You should create a new IDE project for this assignment and create a Java package called
dsa.ref2223 (that is, a package called ref2223 inside a package called dsa) and add all the
provided classes to it. Alternatively, add the provided classes to a package of your choice
and correct the package declaration line at the top of each file to match the name of your
chosen package.
Task 1
You must add code to the StudentWork class in this program so that it creates a collection of
all the unique coordinates found in the input data. To do this you must examine each
coordinate read from the file and make a collection that contains each unique coordinate
only once.
Please note that you are given a working implementation that does this inefficiently
using a list and you can use this code as the basis of your more efficient versions.
The program is designed to do this in three different ways, in three different methods
provided for the purpose. The first of these methods is complete and working and provided
for you as an example. The other two methods you must complete yourself, but they will be
quite similar to the first in many regards.
1. The method distinctCoordsUsingList(List<Coordinate> input) uses an ArrayList that is
initialised to contain enough space so that it does not need resizing (the same size
as the input data) and then checks whether it contains each pair already, and if not
adds it.
2. In the method distinctCoordsUsingTreeSet(List<Coordinate> input) you must use a
TreeSet that will automatically ignore attempts to add duplicate objects, so you can
just add all the pairs to it one after another.
3. In the method distinctCoordsUsingHashSet(List<Coordinate> input) you must use a
HashSet that will also automatically ignore attempts to add duplicate objects, so you
can just add all the pairs to it one after another.
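The three approaches described above can be sketched in general terms as follows. This is a minimal illustration, not the provided code: it uses generic element types and strings as stand-ins for coordinates, and the class name, method names and return types are assumptions that may differ from those in StudentWork.java.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration; names and return types are assumptions.
final class DistinctSketch {

    // List approach: contains() scans the list, so every check costs O(n).
    static <T> List<T> distinctUsingList(List<T> input) {
        List<T> result = new ArrayList<>(input.size()); // pre-sized, never resizes
        for (T item : input) {
            if (!result.contains(item)) {
                result.add(item);
            }
        }
        return result;
    }

    // TreeSet approach: add() ignores an element that is already present.
    static <T extends Comparable<? super T>> Set<T> distinctUsingTreeSet(List<T> input) {
        Set<T> result = new TreeSet<>();
        for (T item : input) {
            result.add(item);
        }
        return result;
    }

    // HashSet approach: add() also ignores an element that is already present.
    static <T> Set<T> distinctUsingHashSet(List<T> input) {
        Set<T> result = new HashSet<>();
        for (T item : input) {
            result.add(item);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("1:2", "3:4", "1:2", "5:6", "3:4");
        System.out.println(distinctUsingList(input).size());
        System.out.println(distinctUsingTreeSet(input).size());
        System.out.println(distinctUsingHashSet(input).size());
    }
}
```

All three calls should report the same number of distinct elements, which is the same consistency check the specification asks you to make on your own output.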
Important: The Coordinates in the input data are safe to use as a value in both a HashSet
and TreeSet, and safe to use as a key in both a HashMap and TreeMap.
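In Java, being "safe" in this sense generally means that a class defines equals and hashCode consistently (needed by HashSet and HashMap) and provides an ordering consistent with equals (needed by TreeSet and TreeMap). The provided Coordinate class is not reproduced in this document, so the sketch below uses a hypothetical stand-in class to show what such definitions typically look like; the real class may be written differently.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in; the provided Coordinate class may differ.
final class SketchCoordinate implements Comparable<SketchCoordinate> {
    private final int x;
    private final int y;

    SketchCoordinate(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Two coordinates are equal when both values match.
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof SketchCoordinate)) return false;
        SketchCoordinate that = (SketchCoordinate) other;
        return x == that.x && y == that.y;
    }

    // Equal objects must produce equal hash codes.
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }

    // Order by x, then by y; returns 0 exactly when equals() is true.
    @Override
    public int compareTo(SketchCoordinate other) {
        int byX = Integer.compare(x, other.x);
        return byX != 0 ? byX : Integer.compare(y, other.y);
    }

    public static void main(String[] args) {
        Set<SketchCoordinate> hashed = new HashSet<>();
        Set<SketchCoordinate> sorted = new TreeSet<>();
        for (int i = 0; i < 2; i++) {
            hashed.add(new SketchCoordinate(25634, 65432));
            sorted.add(new SketchCoordinate(25634, 65432));
        }
        System.out.println(hashed.size() + " " + sorted.size());
    }
}
```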
The existing main method generates a set of the data files in the current directory and for
each file, calls each of these methods in turn. For each call it uses
System.currentTimeMillis() to record the time before and after calling the method so that it
can then display the length of time taken as well as the number of unique pairs that have
been found. The inputs are processed in increasing order of size (number of coordinates).
All three methods should find the same number of coordinates - if not, they are not working
properly.
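The timing pattern described above can be sketched as follows. This is only an illustration of recording System.currentTimeMillis() before and after a piece of work; the class name, the helper method and the sample data are assumptions, and the provided main method may organise its timing differently.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of before/after timing; not the provided main method.
final class TimingSketch {

    /** Runs the task once and returns the elapsed wall-clock time in milliseconds. */
    static long timeMillis(Runnable task) {
        long before = System.currentTimeMillis();
        task.run();
        long after = System.currentTimeMillis();
        return after - before;
    }

    public static void main(String[] args) {
        // Sample data with some repeated values (an assumption for this sketch).
        List<Integer> input = new ArrayList<>();
        for (int i = 0; i < 200_000; i++) {
            input.add(i % 150_000);
        }

        Set<Integer> unique = new HashSet<>();
        long elapsed = timeMillis(() -> unique.addAll(input));

        System.out.println(unique.size() + " unique values found in " + elapsed + " ms");
    }
}
```

Because System.currentTimeMillis() has millisecond resolution, very fast runs on small inputs may report 0 ms.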
You can run the program as it stands, but it will only use the distinctCoordsUsingList method
to find the number of unique coordinates. This method is very slow and the program will
take a considerable time to run.
Once the program completes, it will print out a little summary of the times taken for each
method.
Completing the distinctCoordsUsingTreeSet and distinctCoordsUsingHashSet methods does
not require a lot of complicated code. They will be very similar to the existing
distinctCoordsUsingList and even more similar to each other. They will also run much, much
faster than the existing method.
While developing your code for these methods, you may want to change the boolean
variable found in the main method called suppressListCalculation to have the value true.
This causes the program to skip doing the slow list calculation and will help you to test your
new code without waiting for that to be done. Other than this, you should not change any of
the code in the main method.
You need to explain how your code works and discuss the different times that your program
reports, explaining the reasons behind the differences in timing in terms of the concepts we
have studied in this module. You should copy and paste the summary outputs from your
program into the green shaded region of the charts.xlsx spreadsheet to see the results
plotted in the graphs there; this may help in your discussion of the performance of the
different methods. Note that you must use the "Paste Special" option and choose
"Unicode Text" as the paste type to have the data correctly paste into the rows and
columns. For each of the charts you should comment on what kind of performance curve
you expect to see, based on the underlying data structure being used, and what kind of
curve you actually see. Be aware that many factors such as runtime compiler optimisations,
competition for CPU time with other processes, operating system memory management and
the number of unique coordinates actually present in the different input sets will obscure the
theoretical performance of the data structures involved. Therefore the results you might
expect on the basis of theory may not match the actual observations - discussing any
inconsistencies would be worth significant credit even if you cannot explain them.
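To get a feel for the shapes of the theoretical curves, the sketch below prints idealised growth figures for doubling input sizes. It assumes the standard cost models for the Java collections: a list search per element gives total work that grows roughly with n squared when most values are unique, a TreeSet (a balanced tree) gives roughly n log n, and a HashSet gives roughly n on average. The class and method names are hypothetical, and the figures are abstract step counts, not predicted times.

```java
// Hypothetical illustration of idealised growth rates; not measured timings.
final class GrowthSketch {

    // Roughly the total work for n HashSet insertions (average case).
    static double linear(long n) {
        return n;
    }

    // Roughly the total work for n TreeSet insertions.
    static double nLogN(long n) {
        return n * (Math.log(n) / Math.log(2));
    }

    // Roughly the total work for n list searches over mostly unique data.
    static double quadratic(long n) {
        return (double) n * n;
    }

    public static void main(String[] args) {
        System.out.println("n, n, n log n, n^2");
        for (long n = 1000; n <= 8000; n *= 2) {
            System.out.printf("%d, %.0f, %.0f, %.0f%n",
                    n, linear(n), nLogN(n), quadratic(n));
        }
    }
}
```

Doubling n doubles the linear figure, slightly more than doubles the n log n figure, and quadruples the quadratic figure; comparing those ratios with your measured times is one way to frame the discussion.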
Task 2
You need to provide an implementation of the method
frequencyCountUsingMap(List<Coordinate> input) so that it returns a map which has the
distinct coordinates in the input as the keys and the number of times they occur as the
values. The main method also displays the time this method takes and you should
experiment with implementations based on both TreeMap and HashMap and discuss the
code for each approach and the differences in the results. Once this method has been
implemented, for the last input processed (only) the program displays the frequency map
counts for all coordinates that occur more than once. In general there will only be a small
number of duplicates - the input sets are mostly filled with unique values.
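The general idea of a frequency map can be sketched as below. This is a minimal illustration using strings as stand-ins for coordinates; the class name, method name and signature are assumptions and differ from the frequencyCountUsingMap method you are asked to complete. Passing in the map lets the same counting loop be tried with either a HashMap or a TreeMap.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; not the signature used in StudentWork.java.
final class FrequencySketch {

    /** Adds one to the count for each item, starting from 1 on first sight. */
    static <K> Map<K, Integer> countInto(Map<K, Integer> counts, List<K> input) {
        for (K item : input) {
            counts.merge(item, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("1:2", "3:4", "1:2", "5:6");

        Map<String, Integer> hashed = countInto(new HashMap<String, Integer>(), input);
        Map<String, Integer> sorted = countInto(new TreeMap<String, Integer>(), input);

        // Show only the entries that occur more than once.
        for (Map.Entry<String, Integer> entry : sorted.entrySet()) {
            if (entry.getValue() > 1) {
                System.out.println(entry.getKey() + " occurs " + entry.getValue() + " times");
            }
        }
        System.out.println(hashed.equals(sorted));
    }
}
```

A TreeMap iterates its keys in sorted order while a HashMap gives no ordering guarantee, which is one observable difference between the two beyond their timings.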
Assignment Report
You must write a report in which you present all the code you have written (but not the code
that is already given to you), properly formatted in a fixed width font and clearly readable.
You should also include the summary output data your program produces, and if you have
used this in the provided charts.xlsx spreadsheet a screenshot of that spreadsheet and its
graphs might be useful to help you in your discussion of the performance of your code.
For each of the methods involved in each task (that is, distinctCoordsUsingList,
distinctCoordsUsingTreeSet, distinctCoordsUsingHashSet and frequencyCountUsingMap)
you should clearly explain the code and how it works, making sure you explain what the data
structure behind the code is based upon (binary search tree, hash table, array list etc.) and
the corresponding big O performance characteristics you expect it to exhibit. You should
then discuss the timings output by the program and to what extent these correspond to what
theory predicts.
You must also cite any reference materials (including course materials) that you have
used to help you complete the assignment.
Assessment Criteria and Weighting
LSBU marking criteria have been developed to help tutors give you clear and helpful
feedback on your work. They will be applied to your work to help you understand
what you have accomplished, how any mark given was arrived at, and how you can
improve your work in future.
For this assignment the following criteria will be applied (also see rubric following).
Marking Criteria
This assignment will be marked using an adaptation of the University’s standardised marking
criteria. It is important that you pay attention to the criteria that will be applied and address
them in the text of your report. A detailed rubric is shown on the next page, but the main
criteria are as follows:
1. Subject Knowledge (35%)
Understanding and application of subject knowledge. Contribution to subject debate.
Assessed implicitly by your written explanation of the code, and explicitly by your
ability to discuss it in terms of the theoretical content of the module and
demonstrate understanding of data structures and the significance of the algorithms
associated with them.
2. Critical Analysis (15%)
Assessed mainly by your analysis of the output of the program and its relation to
theory, but also by the rationale you give for your design approaches and their
contrast to alternative approaches in the coding.
3. Testing and Problem-Solving Skills (30%)
Design, implementation, testing and analysis of product / process / system / idea /
solution(s) to practical or theoretical questions or problems.
Assessed on the basis of the level of achievement of the code you have developed and
the understanding of it that you have documented. Bear in mind that code you do not
discuss in your narrative will be given very little credit (the implication being that you
do not understand what it is and what it does). However, your ability to explain the
problems involved and the solutions they demanded will also be considered here,
whether or not you were able to solve them.
4. Practical Competence (10%)
Skills to apply theory to practice or to test theory.
Assessed on documented evidence of your use of technical documentation (for
example tutorials and reference documentation and course materials) in working out
how to accomplish the assignment.
5. Personal and Professional Development (10%)
Management of learning through self-direction, planning and reflection
Assessed on the basis of the quality of your submitted report, including clarity of
writing, presentation, and properly addressing the assignment specification.
Please note the criteria weightings and general interpretation shown in bold capitals under each criterion.
Grade bands: Outstanding 100-80%, Excellent 79-70%, Very good 69-60%, Good 59-50%, Satisfactory 49-40%, Inadequate 39-30%, Very poor 29-0%

Subject Knowledge
Understanding and application of subject knowledge. Contribution to subject debate.
CODE EXPLANATION 35%
Outstanding (100-80%): Shows sustained breadth, accuracy and detail in understanding key aspects of subject. Contributes to subject debate. Awareness of ambiguities and limitations of knowledge.
Excellent (79-70%): Shows breadth, accuracy and detail in understanding key aspects of subject. Contributes to subject debate. Some awareness of ambiguities and limitations of knowledge.
Very good (69-60%): Accurate and extensive understanding of key aspects of subject. Evidence of coherent knowledge.
Good (59-50%): Accurate understanding of key aspects of subject. Evidence of coherent knowledge.
Satisfactory (49-40%): Understanding of key aspects of subject. Some evidence of coherent knowledge.
Inadequate (39-30%): Some evidence of superficial understanding of subject. Inaccuracies.
Very poor (29-0%): Little or no evidence of understanding of subject. Inaccuracies.

Critical Analysis
Analysis and interpretation of sources, literature and/or results. Structuring of issues/debates.
ANALYSIS 15%
Outstanding (100-80%): Outstanding demonstration of critical analysis of the possible design strategies that could be used to meet the software requirements, and evaluation of the approaches chosen.
Excellent (79-70%): Excellent demonstration of critical analysis of the possible design strategies that could be used to meet the software requirements, and evaluation of the approaches chosen.
Very good (69-60%): Very good demonstration of critical analysis of the possible design strategies that could be used to meet the software requirements, and evaluation of the approaches chosen.
Good (59-50%): Good demonstration of critical analysis of the possible design strategies that could be used to meet the software requirements, and evaluation of the approaches chosen.
Satisfactory (49-40%): Demonstration of critical analysis of the key possible design strategies that could be used to meet the software requirements, and evaluation of the approaches chosen.
Inadequate (39-30%): Trivial demonstration of critical analysis of the possible design strategies that could be used to meet the software requirements, and evaluation of the approaches chosen.
Very poor (29-0%): Little or no critical analysis has been demonstrated.

Testing and Problem-Solving Skills
Design, implementation, testing and analysis of product/process/system/idea/solution(s) to practical or theoretical questions or problems.
IMPLEMENTATION AND DISCUSSION 30%
Outstanding (100-80%): Outstanding implementation of all required software, with near perfectly organised, formatted and documented source code, and documented demonstration of runtime behaviour, and outstanding comprehension evidenced.
Excellent (79-70%): Excellent implementation of all required software, with well organised, formatted and documented source code provided, and excellent comprehension evidenced.
Very good (69-60%): Competent implementation of all required software, with well organised, formatted and documented source code, and documented demonstration of runtime behaviour, and very good comprehension evidenced.
Good (59-50%): Implementation of all required software, with well organised, formatted and documented source code, and documented demonstration of runtime behaviour, with some missing/incorrect functionality or poor quality. Good evidence of comprehension.
Satisfactory (49-40%): Implementation of most of the required software, with well organised, formatted and documented source code, and documented demonstration of runtime behaviour, with some missing/incorrect functionality or poor quality. Some evidence of comprehension.
Inadequate (39-30%): Implementation of only part of the required software, with well organised, formatted and documented source code, and documented demonstration of runtime behaviour, with some missing/incorrect functionality or poor quality. Little evidence of comprehension.
Very poor (29-0%): Little or no functionality has been implemented.

Practical Competence
Skills to apply theory to practice or to test theory.
USE OF REFERENCE MATERIAL 10%
Outstanding (100-80%): Outstanding descriptions of factual information, programming techniques or theoretical explanations being found in technical or theoretical reference material.
Excellent (79-70%): Excellent explicit descriptions of all factual information, programming techniques or theoretical explanations that were found in technical or theoretical reference material.
Very good (69-60%): Good explicit descriptions of all factual information, programming techniques or theoretical explanations that were found in technical or theoretical reference material.
Good (59-50%): Reasonable descriptions of most factual information, programming techniques or theoretical explanations that were found in technical or theoretical reference material.
Satisfactory (49-40%): Basic examples of the main factual information, programming techniques or theoretical explanations that were found in technical or theoretical reference material.
Inadequate (39-30%): Some trivial examples of factual information, programming techniques or theoretical explanations being found in technical or theoretical reference material.
Very poor (29-0%): Little or no evidence of factual information, programming techniques or theoretical explanations being found in technical or theoretical reference material.

Personal and Professional Development
Management of learning through self-direction, planning and reflection.
REPORT QUALITY 10%
Outstanding (100-80%): Outstanding report organisation, structure, presentation, narrative voice and language.
Excellent (79-70%): Excellent report organisation, structure, presentation, narrative voice and language.
Very good (69-60%): Very good report organisation, structure, presentation, narrative voice and language.
Good (59-50%): Good report organisation, structure, presentation, narrative voice and language.
Satisfactory (49-40%): Satisfactory report organisation, structure, presentation, narrative voice and language.
Inadequate (39-30%): Poor report organisation, structure, presentation, narrative voice and language.
Very poor (29-0%): Report does not constitute a serious attempt at the assignment.
How to get help
Resources
The main resources are the tutorial documents that you used with the
Raspberry Pi activities and the activities themselves.