
The Ultimate OCR A Level Computer Science Dictionary (v5.0)
By The Muslim CompSci
COMPONENT 1
1.1 The characteristics of contemporary processors, input, output and storage devices

Components of a computer and their uses

1.1.1 Structure and function of the processor

1) Central Processing Unit (CPU): A general-purpose processor that executes instructions in a
computer system through the fetch-decode-execute (FDE) cycle. Each CPU consists of: an Arithmetic
Logic Unit (ALU), a Control Unit (CU), registers and buses.

2) Arithmetic Logic Unit (ALU): Carries out arithmetic calculations and logical decisions. Acts as a
conduit through which all I/O to the computer is done and as a gateway to and from the processor. The
results of its calculations are stored in the Accumulator (ACC).

3) Control Unit (CU): Decodes and manages the execution of instructions, using control signals to
coordinate the movement of data through the processor and other parts of the computer. Sends out
signals to coordinate how the processor works. Synchronises actions using an inbuilt clock. Controls the
FDE cycle and buses.

4) Register: A memory location in the processor that temporarily stores data and control
information. Registers provide faster access to data than RAM for specific purposes during the FDE cycle,
when frequent access is needed.

5) Program Counter (PC): Controls the sequence in which instructions are retrieved and
executed by storing the address of the next instruction to be processed. This value is sent to the
MAR, and the PC is incremented by 1 each FDE cycle after being read; it is changed to the address held
in the CIR if the operation is a jump.

6) Accumulator (ACC): Provides temporary storage of intermediate results in the ALU and holds the data
currently being processed during calculations. Deals with the I/O data in the processor and is used as
a buffer. All arithmetic and logical operations use the ACC.

7) General Purpose Register (GPR): Used to temporarily store data being used rather than sending
data to and from the comparatively much slower memory.

8) Memory Address Register (MAR): Contains the address of the next instruction or location to be
accessed in memory (copied from the PC), or the address of the next data item to be used (copied
from the operand part of the instruction in the CIR).

9) Memory Data Register (MDR): Contains the instruction or data from the memory location whose
address is specified in the MAR while it is being transferred between memory and the processor.
Receives the data currently being used by the processor from memory. Acts as a buffer and copies
data/instructions to the CIR.

10) Current Instruction Register (CIR): Holds the most recently fetched instruction while it is
decoded and executed. The instruction contents are split into two component parts. The opcode is the
first part of the instruction to be decoded, so the CU knows what to do; the remainder of the
instruction content is the address of the data to be used with the operation, or the actual data itself
if an immediate operand is used. The operand is copied to the MAR if it is an address for accessing data
into the ACC, or to the MDR if it is data. The CIR sends the address to the PC for a jump instruction and
determines the type of addressing to be used.

11) Bus: A parallel group of communication wires able to transmit data in groups of bits
together from one register to another in the processor.

12) The Data Bus: Carries the data being transmitted from one register to another between areas of
the processor and memory. It is two-way because the direction of travel is not fixed.

13) The Address Bus: Carries the address of the memory location to or from which data is being
transmitted.

14) The Control Bus: Transmits control signals from the CU to allow synchronisation of signals to the
rest of the processor.

15) Fetch-Decode-Execute (FDE) Cycle: Fetch: The next instruction is fetched into the processor from
the address in main memory held by the PC. The PC passes this address to the MAR, which provides the
location sent along the address bus. The PC is incremented each cycle and the fetch signal is sent on
the control bus. The contents of the memory location are sent from memory to the processor on the
data bus and stored in the MDR. Decode: The instruction is copied from the MDR to the CIR, where it
is decoded into opcode and operand by the CU. Execute: The operation indicated by the opcode is
carried out on the operand by the processor; for example, for an ADD instruction the contents of the
MDR and ACC are sent to the ALU and the result is stored back in the ACC.
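To make the cycle concrete, here is a minimal sketch of an FDE loop in Python for a made-up accumulator machine; the LOAD/ADD/STORE/HALT opcodes and the memory layout are illustrative assumptions, not a real instruction set.

    # A minimal sketch of the FDE cycle for a made-up accumulator machine.
    # Instructions are (opcode, operand) pairs held in the same memory as
    # data (a Von Neumann arrangement).
    memory = [
        ("LOAD", 4),   # ACC <- memory[4]
        ("ADD", 5),    # ACC <- ACC + memory[5]
        ("STORE", 6),  # memory[6] <- ACC
        ("HALT", 0),
        7, 35, 0,      # data
    ]

    pc, acc = 0, 0
    while True:
        mar = pc                   # fetch: PC copied to MAR
        mdr = memory[mar]          # contents arrive in the MDR via the data bus
        pc += 1                    # PC incremented ready for the next cycle
        cir = mdr                  # instruction copied to the CIR
        opcode, operand = cir      # decode: split into opcode and operand
        if opcode == "LOAD":       # execute
            acc = memory[operand]
        elif opcode == "ADD":
            acc += memory[operand]   # in hardware this step uses the ALU and ACC
        elif opcode == "STORE":
            memory[operand] = acc
        elif opcode == "HALT":
            break

    print(memory[6])   # 42

A jump instruction would simply overwrite pc with its operand instead of letting the incremented value stand.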

16) Clock Speed: The clock controls the process of executing instructions and fetching data.
Processors have increasingly high clock speeds and can be overclocked to give more cycles per
second, so more instructions can be executed per second and the program takes less time to run.
The gains from increased clock speed are limited: even doubling the clock speed would only
halve the time taken.

17) Number of Cores: Each core is a distinct processing unit on the CPU. Processors can have
multiple cores to speed up smaller problems. When multitasking, different cores can run different
applications. Multiple cores can also work on the same problem.

18) Cache: Small memory that works much faster than main memory. By anticipating the
data/instructions that are likely to be regularly accessed, the overall speed at which the processor
operates can be increased. The more space for data/instructions in cache memory, the less frequently
RAM needs to be accessed, as accessing cache is quicker.

19) PIPELINING: Allows one instruction to be fetched as the previous one is being decoded and
the one before that is being executed. Jump instructions do not pipeline well because they could be
followed by one of many instructions, determined only at execution. This means the wrong one may be
fetched or decoded, so the pipeline would need to be flushed.
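A rough way to visualise the overlap is a sketch like the following, which prints which instruction occupies each stage on each clock cycle; real pipelines have more stages and hazards.

    # A sketch of a three-stage pipeline: while instruction n is executing,
    # n+1 is being decoded and n+2 fetched.
    instructions = ["i1", "i2", "i3", "i4", "i5"]
    stages = ["fetch", "decode", "execute"]

    for cycle in range(len(instructions) + len(stages) - 1):
        row = []
        for s, stage in enumerate(stages):
            i = cycle - s          # instruction index occupying this stage
            if 0 <= i < len(instructions):
                row.append(f"{stage}:{instructions[i]}")
        print(f"cycle {cycle + 1}: " + "  ".join(row))

From cycle 3 onwards one instruction completes per cycle, which is the speed-up pipelining buys; a mispredicted jump would mean discarding the fetched and decoded entries (flushing).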

20) Von Neumann Architecture: The most common computer architecture. A single processor CU
manages program control. Uses the FDE cycle to execute one instruction at a time in a linear sequence.
Program and data are stored together in the same memory and format. Simple OS and easy to program, but slow
when processing large sets of data.
21) Harvard Architecture: Data/instructions are stored in separate memory units with separate
buses. So while data is being written to or read from the data memory, the next instruction can be
read from the instruction memory.

22) Contemporary Processor Architecture: Modern high-performance CPU chips incorporate
aspects of both architectures.

1.1.2 Types of processor

23) Complex Instruction Set Computer (CISC): More complicated processor design. Uses complex
instructions; each instruction may take multiple machine cycles. Does not allow pipelining.
Instructions have a variable format and number of bytes. Larger instruction set: many instructions are
available, and a single instruction can perform a complex task, so there is no need to combine multiple
instructions; a task may be completed in a single machine cycle. Many addressing modes are available.
Uses a single register set. Does not have GPRs, so needs to constantly send data to and from memory.
Requires less RAM. Programs run more slowly due to complicated circuitry. Integrated circuitry is more
expensive.

24) Reduced Instruction Set Computer (RISC): Simpler processor design. Uses simpler instructions;
each instruction takes one machine cycle. Allows pipelining. Instructions have a fixed format and
number of bytes. Smaller instruction set: a limited number of instructions are available, and each
instruction performs a simple task, so complex tasks can only be performed by combining multiple
instructions; a task may take a number of machine cycles. Fewer addressing modes are available. Uses
one or more register sets. Has GPRs to reduce the need to constantly send data to and from memory.
Requires more RAM. Programs run faster due to simple instructions. Simple circuitry is cheaper.

25) GRAPHICS PROCESSING UNIT (GPU): Designed specifically for graphics, so GPUs have built-in
circuitry and instruction sets for the calculations required in common graphics operations. They tend to
have a large number of cores, so can work on highly parallelisable problems. They are able to perform the
same instruction on multiple pieces of data at one time (SIMD), so are suited to processing graphics
(e.g. transforming points in a polygon or shading pixels), which means they can perform transformations
to on-screen graphics quickly. They are becoming a cost-efficient way of tackling problems other than
graphics processing, including: a subset of science/engineering problems, modelling physical systems,
data mining, audio processing, breaking passwords and machine learning.

26) Multicore Processor: Have more than one processor incorporated into a single chip.

27) Parallel Processing: A computer carries out multiple computations simultaneously to solve a
given problem. Uses multiple processors working together to perform a single job, which is split into
tasks so that each task may be processed by any processor and the job is completed more quickly. Allows
faster processing and speeds up arithmetic, as multiple instructions are processed at the
same time and complex tasks are performed efficiently. Parallel processing isn't suited to all
problems, and most problems are only partially parallelisable. The processors are controlled by a complex OS
to adapt the sequential algorithms and ensure synchronisation. Programs may need to be written
specially or rewritten in a suitable format, so writing algorithms is more challenging and the program
is more difficult to test/debug.

28) SIMD (Single Instruction Multiple Data): The same instruction operates simultaneously on
multiple data locations.

29) MIMD (Multiple Instructions Multiple Data): Different instructions operate concurrently on
different data locations.
1.1.3 Input, output and storage

30) Input Device: Peripheral devices used to pass data into the computer, allowing the user to
communicate with the computer, e.g. keyboard, mouse, scanner, microphone.

31) Output Device: Peripheral devices used to report the results of processing from a computer
to the user, allowing the computer to communicate with the user, e.g. printer, speaker, monitor.

32) Storage Device: Peripheral devices used to store data permanently, so it is retained when the power is
switched off. There are three main categories of storage device: magnetic, flash and optical.

33) Magnetic Storage: Uses magnetisable material and works by magnetic patterns being read off
platters that mechanically spin at high speeds. They tend to have a high capacity at a low cost. They
can be noisy due to parts moving at high speed and be susceptible to damage if moved quickly. They
also require enough space for their moving parts. Examples include: HDDs, zip drives and magnetic
tape (often used to back up servers).

34) Flash Memory: A solid-state technology where data is stored on memory chips. These can
have their contents erased and subsequently overwritten when an electrical charge is applied. They
have no moving parts and therefore tend to have lower power consumption and higher read/write
speeds than magnetic or optical media. They are not affected by their device moving, so require less
space and operate silently. They are, however, more expensive per gigabyte than magnetic storage.
Examples include: SSDs, USB memory sticks and flash memory cards.

35) Optical Storage: Works by using a laser and by looking at its reflection. They tend to be cheap to
distribute and fairly resilient. Examples include: CDs, DVDs and Blu-ray discs.

36) Random Access Memory (RAM): Where the user files, applications software and OS currently
in use are temporarily stored. The "random" aspect is that the processor can access any of its data
locations directly and as quickly as any other data location. It operates at a much faster
read/write speed than secondary storage media. It is volatile, meaning it loses its contents when the
power is off. It is editable, meaning data can be written to it, allowing the user to alter the saved
contents of files in current use. A large RAM reduces buffering.

37) Read-Only Memory (ROM): Generally small memory that can be read from but not written to.
A common use for it is storing the OS and BIOS bootstrap program used to start up a computer quickly.
This must not be deleted or amended unintentionally, so it is best stored on a read-only medium.
It must be present in memory and immediately available when the computer is switched on,
so it must be stored on a non-volatile medium, meaning the contents of its memory are
not erased when the power is off. ROM contents cannot be altered, so there is no chance of
the OS being accidentally or maliciously changed.

38) Virtual Storage: The combination of physical storage devices into what appears to be a single
storage device. Storage and software are remote and accessible anywhere. No need to hire specialist
staff or to organise backups; security is someone else's responsibility.

1.2 Software and software development

Types of software and the different methodologies used to develop software

1.2.1 Systems Software

39) Systems Software: Programs that control the hardware and operation of the computer system.
Acts as an interface between the processor and the user and makes the hardware usable by the
operator. Provides a platform on which to run, and give access to, other software. Provides housekeeping
software and manages applications.

40) Operating System (OS): A set of software programs designed to manage the hardware of the
system. A modern OS has several purposes. It controls the hardware of the system through software
such as resource management, hardware drivers, task management, scheduling, memory management,
paging, segmentation and virtual memory. It acts as a platform on which applications software can run
and deals with issues that the software may have, e.g. storage of files. It provides the operator with a
suitable HCI to allow communication between user and hardware, e.g. a command line interface. It
handles communications between computer devices using rules and protocols to govern communication,
e.g. across a network. It handles translation of code through compilers, interpreters and assemblers to
translate HLL/LLL into machine code. It includes many utility software programs used to carry out
housekeeping tasks to maintain the system. It uses job scheduling to provide fair access to the
processor according to set rules. It provides security to protect access to user files, e.g. through a
password system. It handles interrupts based on priorities.

41) Memory Management: Organises the use of main memory by converting logical addresses to
physical addresses. Ensures no space is wasted by partitioning programs into chunks. Allows
programs to share memory while protecting processes' data from each other: programs can't
access each other's memory unless legitimately required to. Allocates memory to allow separate
processes to run at the same time and reallocates memory when necessary. Allows programs larger
than main memory to run. Provides security to protect the OS.

42) Paging: Partitions memory into fixed-size physical divisions into which sections of programs are fitted.

43) Segmentation: Partitions memory into variable-size logical divisions which hold complete
sections of programs. Both paging and segmentation: are assigned to memory when needed, to allow
programs to run despite insufficient memory; are stored on a backing-store disk to swap parts of
programs used for virtual memory; allow programs to be stored in memory non-contiguously; and may
cause disk thrashing, when more time is spent swapping pages than processing, so the computer may 'hang'.

44) Virtual Memory: When the memory available is insufficient, an allocated area of a secondary
storage device is used to allow large programs to run. Uses a backing store as additional memory for
temporary storage. Swaps pages, using paging, between RAM and the backing store to make space in
RAM for the pages needed; the backing store holds the parts of the program not currently in use. A high
rate of disk access may cause the computer to 'hang' when the disk is relatively slow and more time is
spent transferring pages between memory and disk than processing. This problem is called disk thrashing.

45) Interrupt: Different devices, processes and tasks may need to alert the processor that they
need attention at various times. They obtain processor time by generating a signal or message
indicating to the processor that they need to be serviced, which stops the processor and causes a
break in the execution of the current routine. Each interrupt has a priority used to decide between the
interrupt and the current task, so important tasks take precedence if two or more occur
together. An interrupt can only interrupt a lower-priority task, to avoid delays and loss of data and to
ensure the most urgent task is performed first. An interrupt is serviced once the current FDE cycle has
finished, to ensure the most efficient use of the processor.

46) Interrupt Service Routine (ISR): The processor checks the interrupt register, comparing the priority
of the incoming interrupt with the current task. If it is of lower or equal priority to the current task,
the current task continues. If it is of higher priority, the CPU completes the current FDE cycle, the
contents of the registers are copied to a LIFO stack in memory, and the location of the appropriate
ISR is loaded into the PC. When the ISR is complete: flags are reset to the inactive state; further
interrupts are checked and serviced if necessary; and the previous state is popped from the stack and
loaded back into the registers in order to resume processing.

47) Scheduling: A scheduler is a program that manages the amount of time different processes have
in the CPU. It has several purposes: Maximise number of interactive users receiving reasonably fast
response times with no apparent delay. Maximise number of jobs processed in the least possible
time. Obtain the most efficient use of processor time and utilise resources dependent upon
priorities. Ensure all jobs are processed fairly by changing priorities where necessary so long jobs do
not monopolise the processor. Prevent process starvation from applications in deadlock failing to
run.

48) Round Robin: A pre-emptive scheduler allocates each user or process a very small time slice in
sequence. If a job hasn't finished by the end of its time slice, the system moves to the next user in turn
and the job moves to the back of the queue. If the next user needs the processor they are given a time
slice, and this is repeated until all users are serviced. The order may depend on the users' different
priorities, but the users are unaware of any apparent delays.
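A minimal sketch of the idea in Python, assuming made-up job names and burst times:

    from collections import deque

    # Round-robin scheduling: each job gets a fixed time slice; unfinished
    # jobs rejoin the back of the queue.
    jobs = deque([("A", 5), ("B", 2), ("C", 4)])   # (name, time remaining)
    TIME_SLICE = 2

    while jobs:
        name, remaining = jobs.popleft()
        run = min(TIME_SLICE, remaining)
        remaining -= run
        print(f"{name} runs for {run} unit(s), {remaining} left")
        if remaining > 0:
            jobs.append((name, remaining))   # back of the queue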

49) First Come First Served: Jobs are processed in order of arrival so the first process to arrive is
dealt with by the CPU until it is finished; meanwhile, any other processes that come along are
queued up for their turn. This may not be efficient as once a job starts it prevents other jobs from
being processed. A job using a slow resource e.g. a printer wastes processor time.

50) Multi-Level Feedback Queues: This uses a number of queues. Each of these queues has a
different priority. The algorithm can move jobs between these queues depending on the job's
behaviour.

51) Shortest Job First: The scheduler estimates the length of each job and picks the one that will
take the shortest time and runs it until it finishes.

52) Shortest Remaining Time: Similar to shortest job first, but if a job is added with a shorter
remaining time, the scheduler switches to that one.

53) Distributed: Allows multiple computers, resources or processors on a network to work
together on the same problem and be treated as a single system. Shares processing and data
between different systems on a network in order to reduce bottlenecks.

54) Embedded: Built into a device, often with limited resources.

55) Multi-Tasking: A round-robin system which allows more than one task or software program to
run apparently simultaneously, with separate windows for each task open at the same time.
Each task is given a slice of processor time before moving on to the next. The user can switch between
different programs in different windows, e.g. playing music while typing an essay.

56) Multi-User: One computer with many terminals allows more than one user to access the
computer's resources at the same time. Each terminal is given a very small time slice (c.1/100 of a
second) in turn. Uses flags and priorities or privileges. Security provision is essential so that user rights
are enforced and users' data is kept separate. Used in, e.g., a supermarket checkout system, online
gaming and a mainframe serving many terminals.

57) Real Time: The data is processed immediately and a response is given within a guaranteed time
frame.
58) Basic Input/Output System (BIOS): When a computer is first switched on, it looks to the BIOS to
get it up and running, and so the processor's PC points to the BIOS's memory. The BIOS will usually
first check that the computer is functional: that memory is installed and accessible and the processor is
working. It is usually stored on flash memory so that it can be updated; this also allows settings such
as the boot order of disks to be changed and saved by the user.

59) Device Driver: An OS is expected to communicate with a wide variety of devices, each with
different models and manufacturers. It would be impossible for the makers of OSs to program them
to handle all existing and future devices. This is why we need device drivers. A device driver is a
piece of software, usually supplied with a peripheral device, that contains instructions to enable the
peripheral and OS to communicate and configure hardware.

60) Virtual Machine: A theoretical or generalised computer which provides an environment in
which a translator is available and on which programs can run. It has limited, if any, access to some low-level
features, e.g. access to the GPU, which can optimise programs. Used to run an OS inside another
on a software implementation of a machine, e.g. when testing programs' compatibility until the
physical machine is ready. Uses an interpreter to run intermediate code. This intermediate code can
then be run on any computer with a virtual machine, but tends to be slower than compiled code.

61) Intermediate Code: Partly translated, simplified code between high-level and machine code that
is produced by a compiler once the program is error free. It is platform-independent, so can run on a
variety of computer devices and virtual machines using an interpreter, improving portability between
machines. The same intermediate code can be obtained from different HLLs, allowing sections of program
code to be written in different languages, by different programmers, suitable for specific tasks. The
program runs more slowly than executable code as it needs to be translated by additional software each
time it is run. Protects the source code from being copied, helping to protect intellectual
property.

1.2.2 Applications Generation

62) Software: Is the programs/set of instructions/code that run on a computer system to make the
hardware work. Types of software include applications and utilities.

63) Applications Software: Programs that allow the user or hardware to carry out tasks or produce
useful outputs that would otherwise have to be done without a computer. A complete application is a
collection of compatible pieces of software together with electronic or hard-copy user
documentation describing the software to the user. Examples include: Word processors e.g.
Microsoft Word, Spreadsheet packages e.g. Microsoft Excel, Presentation software e.g. Microsoft
PowerPoint, Desktop publishing software e.g. Microsoft Publisher, Image editors e.g. Microsoft
Photo Editor, Web browsers e.g. Microsoft Edge.

64) Utility: Is a relatively small piece of systems software with one specific purpose, usually related
to the maintenance of the system by carrying out housekeeping tasks. Examples include: Anti-virus
programs e.g. Windows Defender, Disk defragmentation e.g. Windows Disk Defragmenter,
Compression e.g. Windows Compression Software, File managers e.g. Windows File Explorer, Backup
utilities e.g. Windows Backup Utility.

65) Open Source: Source code is freely available for others to examine or recompile. This allows
others to make amended versions of the program and contribute to the program's development.
There isn't necessarily a commercial organisation behind the software, so there may not be a helpline
or regular updates. Examples include Linux, LibreOffice and Firefox.
66) Closed Source: Proprietary software is sold in the form of a licence to use it, which restricts
how the software can be used. The software company or developer holds the copyright, so
users will not have access to the source code and will not be allowed to modify the package and sell
it to other people. There will be support available from the company, such as regular updates,
technical support lines, training courses and a large user base. Examples include macOS, iWork and
Safari.

67) Translator: Is used to convert code from one language to another so from HLL or LLL source code
to object, intermediate, executable or machine code and detects errors in source code.

68) Interpreter: Interprets and runs HLL source code, which is machine independent. Translates one
statement and runs it before translating the next line. Reports one error at a time as errors are
found, then stops and indicates the position of the error. The interpreter must be present each time the
program is run, so the program runs more slowly due to translation. Used by virtual machines and
during program development. The source code is visible, meaning it can easily be copied and
amended, though it can be obfuscated.

69) Compiler: Converts HLL source code to machine code. Translates the whole program as a unit
and creates an executable or intermediate language program when the program is completed. Gives
a list of errors at the end of compilation at once but some reported errors may be spurious.
Compiled code is not human readable helping to preserve intellectual property and protect the
program from malicious use. Compiled code is machine dependent and architecture specific so
multiple versions of the code will need to be maintained for different architectures. Optimisation
improves program speed and size so compiled code runs more quickly. The compiler is no longer
needed once the executable code is produced. Easy to get access to lower level features such as GPU
access. Produces intermediate code for virtual machines.

70) Assembler: A program that translates low-level assembly code source into machine or object
code. One assembly language instruction is generally converted into one machine code instruction.
Reserves storage for instructions and data. Replaces mnemonic opcodes with machine codes and
symbolic addresses with numeric addresses. Creates a symbol table to match labels to addresses.
Checks syntax and gives a list of errors, with error diagnostics, at the end.

71) LEXICAL ANALYSIS: The source code program is used as the input. A series of tokens is created
from the individual symbols and reserved words/keywords in the program; each token is a fixed-length
string of binary digits. Variable names are loaded into a look-up/symbol table which stores
information about variables and subroutines. Redundant characters, e.g. white space, are removed.
Comments are removed. Error diagnostics are given. Prepares the code for syntax analysis.
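A toy tokeniser in Python illustrates the stage; the token set and keywords are invented for the example, and real compilers use far richer grammars.

    import re

    # Lexical analysis sketch: source text is split into tokens, whitespace
    # and comments are discarded, and identifiers go into a symbol table.
    TOKEN_SPEC = [
        ("NUMBER",  r"\d+"),
        ("KEYWORD", r"\b(?:if|then|print)\b"),
        ("ID",      r"[A-Za-z_]\w*"),
        ("OP",      r"[+\-*/=<>]"),
        ("SKIP",    r"\s+|#.*"),      # redundant characters and comments
    ]
    pattern = "|".join(f"(?P<{name}>{rx})" for name, rx in TOKEN_SPEC)

    def tokenise(source):
        tokens, symbol_table = [], {}
        for m in re.finditer(pattern, source):
            if m.lastgroup == "SKIP":
                continue              # removed, not passed on
            if m.lastgroup == "ID":
                symbol_table.setdefault(m.group(), len(symbol_table))
            tokens.append((m.lastgroup, m.group()))
        return tokens, symbol_table

    print(tokenise("total = total + 5  # running sum"))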

72) SYNTAX ANALYSIS: Receives and accepts the output from lexical analysis. The compiler checks
the statements, arithmetic expressions and tokens in the program are syntactically correct against
the grammar and rules about the structure of the programming language e.g. matching brackets. An
abstract syntax tree is built. Errors are reported as a list at the end of compilation. Error diagnostics
are given. Further detail is added to the symbol table e.g. data type, scope and address. If there are
no errors, it passes the code to code generation.

73) CODE GENERATION: Is the last phase of compilation that occurs after syntax analysis. Abstract
syntax tree is converted to object code. Produces machine code program/executable
code/intermediate code which is equivalent to the source program. Variables and constants are
given addresses. Relative addresses are calculated.
74) OPTIMISATION: Occurs during code generation. Object code is checked and made as efficient as
possible. Increases processing speed. Reduces number of instructions. Programmer can choose
between speed and size.

75) LINKER: Combines the compiled program code with the compiled library routine code all into
one executable program.

76) LOADER: Is the part of the OS that loads the executable program and associated libraries into
memory and handles addresses before the program is run.

77) LIBRARY: Standard pieces of software which perform common tasks such as sorting and
searching. Routines are compiled and fit into the modularisation of algorithms. They are pre-tested, so
are relatively error free. They are pre-written, so are ready for the programmer to use with a new
program, which saves work and time. They may be used multiple times to reduce repeated code. They
may allow the programmer to use code which has been written in a different source language. They
are written by experts, so allow the programmer to draw on others' expertise.
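For instance, Python's standard library already supplies tested sorting and searching routines; a small sketch:

    from bisect import bisect_left

    # Reusing pre-written, pre-tested library routines instead of coding
    # a sort and a binary search by hand.
    marks = [67, 42, 88, 15, 73]
    marks.sort()                        # library sort routine

    def found(items, target):
        i = bisect_left(items, target)  # library binary-search helper
        return i < len(items) and items[i] == target

    print(marks, found(marks, 73), found(marks, 50))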

1.2.3 Software Development

78) Feasibility Study: Purpose is to carry out initial enquiries to see if there are any reasons why the
new system may not be acceptable before starting to produce it and determine whether the
problem can be solved. Without it, time and money may be spent on a project that is likely to fail.
The plan may be revised if the study highlights problems. Analysts consider parameters like:
Technical feasibility – Is there hardware/software available to implement the solution? Economic
feasibility/cost benefit analysis – Is the proposed solution possible to run economically? Social
feasibility – Is the effect on the humans involved too extreme to be socially acceptable/
environmentally sound? Effect on company's practices and workforce – Is there enough operational
skill in the workforce to be capable of running the new system? What is the expected effect on the
customer? – If customers are not impressed, there may be no point. Legal/ethical feasibility – Can
the proposed system solve the problem within the law? Time available – Is the time scale acceptable
for the proposed system to be possible? Consideration of budget available – Is the budget sufficient
to cover the costs expected?

79) Requirements Specification: The specification document is developed between the client and
software developers that creates an understanding of the problem and presents solutions. It
unambiguously states everything the new system is expected to do and contains: Input
requirements, Output requirements, Processing requirements, Client's agreement to requirements,
Hardware requirements and software requirements.

80) Black Box Testing: Tests different possible suitable predefined sets of inputs to see if they
produce the expected output according to the design without considering how the program works
so you need to test all possible types of situations without understanding or looking at the code.

81) White Box Testing: Understands the complete structure and logic of the program. It uses the
source code to test the actual steps of the algorithms to make sure all parts work as intended so you
need to check every possible condition statement and path through the algorithm with dry runs and
trace tables.

82) Alpha Testing: Is done by the programmers, developers and employees within the software
company playing the role of the user during development to find bugs in the program. May use
emulators rather than actual hardware or software.
83) Beta Testing: Uses the beta version which is the pre-released test version of the program. It is
nearly complete and already tested by the programmers involved in the production. The program is
given to a group of third party users to use as intended and test under normal operating conditions.
The aim is to report any errors or bugs in the program which the programmer overlooked such as
functions which do not work and incompatibility issues with other software or hardware. They may
also report on desirable improvements. The programmer tries to replicate and then address the
errors found and may release updates, fixes or workarounds to the beta testers. Consequently, the
final version will be more robust.

84) User Documentation: Gives instructions to ensure software users can successfully use the
system to produce the desired results. It may contain information such as: descriptions of required
I/O procedures, sample outputs from given inputs, using processing tools, instructions to operate
the system, backing up and archiving procedures, file searching and maintenance, simple
maintenance procedures e.g. how to replace external storage device, error messages and their
meaning, troubleshooting, FAQs, available help sections, contacts for further assistance,
licensing agreement, warranties, explanation of the UI, meaning of icons, how to change settings,
required hardware specifications and system set up procedures, instructions for installation,
glossary, index and contents pages.

85) Technical Documentation: Describes and explains how the system works. This is useful for a
technician who may need to maintain and further develop or alter the system in the future. It may
contain information such as: DFDs showing the flow of data through the system, System flow charts
showing how parts of the system interrelate, Flowcharts showing the operations involved in the
algorithm, ERDs showing how data tables relate to each other.

86) Waterfall Lifecycle: Consists of a series of linear stages that are all presented in order. The stages
need to be completed to produce a working system. The results from each stage in the life cycle are
used to feed information and inform the work on the next stage. If necessary at any stage, it is
possible to return to re-evaluate one or more previous stages to make amendments by either
collecting more information or data or to check on data that has been collected in order to improve
the solution. After returning, all the intervening steps must be revisited. The stages include:
Feasibility Study, Investigation/Requirements Elicitation, Analysis, Design, Implementation/Coding,
Testing, Installation, Documentation, Evaluation, Maintenance. Advantages: Tends to suit large scale
projects with static/stable base requirements. Establishes requirements in early stages and
subsequent stages focus on these. Focuses on the end user at the start and then they may be
consulted at different points throughout the project. The development phase focuses on code that
meets the requirements/design. Ideal for supporting inexperienced project teams and where
requirements are defined. Orderly sequence of development ensures quality documentation with
reliability and maintainability of the developed software. Progress of system development is easily
measurable. Project progresses forward, with only slight movement backward. Disadvantages: If a
change does occur in the requirements, the lifecycle cannot respond easily, often at the cost of time
and money. It can be inflexible and limits changing requirements. Dependent upon a clear definition
of requirements, as there is little 'splash-back'. Produces excessive documentation, and keeping it
updated as the project progresses is time-consuming. Missing system components are often
discovered during design and coding. System performance cannot be tested until the system is
almost fully coded.

87) Agile Development Methodology: These are a group of methods designed to cope with
changing requirements through producing the software in an iterative manner; i.e. it is produced in
versions, each building on the previous and increasing the requirements it meets. This means that if,
on seeing a version, the user realises they haven't fully considered a requirement, they can have it
added in a future iteration. An example of an agile development methodology is extreme
programming.

88) Extreme Programming (XP): Takes on an agile approach and is iterative in nature (the program is
coded, tested and improved repeatedly) with iterations typically a week long. A representative of
the customer becomes part of the team. They help decide the 'user stories' (XP's equivalent of
requirements), decide what tests will be used to ensure they have been correctly implemented and
answer any questions about any problem areas the programmers might have. Advantages: New
requirements can be adopted throughout. An end user is integral throughout. Pair programming and
strict programming standards are used to ensure the code in each iteration is well-tested, robust and
of good enough quality to be used in the final product. Code is created quickly and modules become
available for use by the client as they are completed. Disadvantages: The client has to ensure that
they are represented on the development team to accept completed code and to discuss any
potential changes. Emphasis on coding rather than design results in a lack of documentation, making
it unsuitable for larger projects.

89) Spiral Model: Progresses by evaluating and dealing with risk. The analyst begins by collecting
data followed by each of the other stages leading to evaluation, which will lead to a return to data
collection to modify the results. The different stages are refined each time the spiral is worked
through, iterating until the project is complete. Is normally used for mission-critical projects, as well
as projects where risks are involved and also in games development. Advantages: Large amount of
risk analysis ensures the riskiest parts of the project are identified and dealt with first so issues are
addressed early in project development. A software prototype is created early in the life cycle and
refined with each spiral iteration. Disadvantages: Highly skilled development team needed to
perform risk analysis. Development costs can be high due to number of prototypes being created
and increased customer collaboration.

90) Rapid Application Development (RAD): Is a method for designing software where a mock-up or
prototype system of the program is designed and produced with reduced functionality to a set
deadline. It is then tested and evaluated to obtain feedback from the end users to suggest
improvements. These results are used to inform and refine the design of the next prototype. The
process cycle is repeated with each iteration improving the program until the final, fully functional
version of the product is produced. Used where the requirements of the system are not well defined
and the development team is authorised to make design decisions without the need for detailed
consultation with their senior managers. Advantages: End user can see a working prototype early in
the project. End user is more involved in the design and can change the requirements as the product
becomes clearer to influence the direction the program is taking. Overall development time is
quicker than alternative methods reducing development costs. Concentration on the essential
elements needed for the user with an emphasis on fast completion. Disadvantages: The emphasis on
speed of development may impact the overall system quality. Potential for inconsistent designs and
a lack of attention to detail in respect of administration and documentation. Not suitable for safety-
critical systems.

1.2.4 Types of Programming Language

91) PROCEDURAL LANGUAGE: High-level, 3rd generation, imperative languages. Use sequence,
selection and iteration. Program gives a series of instructions in a logical order line by line to state
what to do and how to do it. Program statements are in blocks called procedures and functions.
Break the solution down into subroutines which are rebuilt and combined to form a program.
Statements are in a specific order i.e. sets tasks to be completed in a specific way. Logic of program
is given as a series of procedure calls. Examples include: VB.NET and Python.

92) Assembly Code: Is a machine-oriented language related closely to the specific design of the
computer being programmed. Is a LLL but higher level than machine code i.e. each instruction is
generally translated into 1 machine code instruction. Uses descriptive names for data stores. Uses
mnemonics for instructions. Uses labels for addresses to allow selection. May use macros. Is
translated by an assembler. Easier to write than machine code, but more difficult than HLL.

93) Little Man Computer (LMC): Is a fictional processor designed to illustrate the principles of how
processors and assembly code work.

94) IMMEDIATE ADDRESSING: Used in assembly language. Uses data in address field as a constant.
Data in the operand is the value to be used by the operator e.g. ADD 45 adds the value 45 to the
value in the ACC.

95) DIRECT ADDRESSING: The simplest and most common method of addressing. Uses the data in
the address field without modification. This means the operand represents the memory location of
the data to be used by the operator. The instruction gives the address to be used e.g. in ADD 23, use
the number stored in address 23 for the instruction. The number of memory locations available that
can be addressed is limited by the size of the address field. Code is not relocatable so uses fixed
memory locations.

96) INDIRECT ADDRESSING: Uses the address field as a vector or pointer to the address to be used.
In other words, the operand is the address of the data to be used by the operator. Used to access
library routines. Increases the size of the address that can be used allowing a wider range of memory
locations to be accessed. E.g. in ADD 23, if address 23 stores 45, address 45 holds the number to be
used.

97) INDEXED ADDRESSING: Uses an Index Register (IR) and an absolute address to calculate the
addresses to be used. The address given is modified by adding the number in the IR to the address
in the instruction. Allows efficient access to a range of memory locations by incrementing the value in
the IR, e.g. to access an array.
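The four modes can be contrasted in a small Python sketch that treats a list as memory; the values and the ADD 23 instruction are illustrative.

    # How the operand 23 of an ADD instruction is interpreted under each mode.
    memory = [0] * 50
    memory[23] = 45
    memory[24] = 7                  # used by indexed addressing
    memory[45] = 10
    acc, index_register, operand = 100, 1, 23

    acc_immediate = acc + operand                           # use 23 itself
    acc_direct    = acc + memory[operand]                   # use memory[23]
    acc_indirect  = acc + memory[memory[operand]]           # memory[23] points to memory[45]
    acc_indexed   = acc + memory[operand + index_register]  # use memory[23 + 1]

    print(acc_immediate, acc_direct, acc_indirect, acc_indexed)   # 123 145 110 107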

98) OBJECT-ORIENTED PROGRAMMING (OOP): Program components are split into small units called
objects which are used by and interact with other objects to build a complex system. Most programs
nowadays are built using OOP at least to some extent. Examples of OOP languages include: Java, C++
and C#.

99) CLASS: A template defining methods and attributes used to construct a set of objects that have
state and behaviour.

100) OBJECT: These are program building blocks. They are self-contained instances of classes based
on real-world entities which contain attributes and methods.

101) METHOD: Is a program code subroutine that forms the actions an object can carry out.

102) ATTRIBUTE: Is a data value, stored in a variable associated with an object.

103) CONSTRUCTOR: The method used to describe how an object is created.

104) INHERITANCE: Is when a derived class inherits all the methods and attributes of its parent
class/superclass. The inheriting class may override some of these methods and attributes and may
also have additional extra methods and attributes of its own. This means that one class can be coded
and used as the base for similar objects which will save time.

105) ENCAPSULATION: Is the process of data hiding by keeping an object's attributes private so their
values can only be directly accessed and changed via public methods defined for the class and not
from outside the class. This means that objects only interact in the way intended and prevents
indiscriminate, unexpected changes to attributes as this could have unforeseen consequences.
Maintains data integrity.

106) POLYMORPHISM: Means that objects of different types can be treated in the same way. This
allows the same method to be applied to objects of different classes. An example might be an array
with objects of different classes having the same method applied to all of them. The code written is
able to handle different objects in the same way which reduces the volume of code produced.
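A short Python sketch draws these OOP terms together; the Animal/Dog/Cat classes are invented for illustration.

    # Class, constructor, encapsulation, inheritance and polymorphism.
    class Animal:
        def __init__(self, name):     # constructor
            self.__name = name        # private attribute (encapsulation)

        def get_name(self):           # public method giving controlled access
            return self.__name

        def speak(self):
            return "..."

    class Dog(Animal):                # Dog inherits Animal's methods/attributes
        def speak(self):              # and overrides this one
            return "Woof"

    class Cat(Animal):
        def speak(self):
            return "Meow"

    # Polymorphism: objects of different classes handled by the same code.
    for animal in [Dog("Rex"), Cat("Tom")]:
        print(animal.get_name(), animal.speak())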

1.3 Exchanging data

How data is exchanged between different systems

1.3.1 Compression, Encryption and Hashing

107) Compression: Is necessary to reduce file sizes and the time taken for data transmission. The
following needs to be taken into account: The available bandwidth, the expected processing power
of the user's computer and the expected storage requirements.

108) Lossy Compression: An algorithm removes data from the file to make the file size smaller, but
the accuracy with which it represents the data is reduced and the information lost in the process is
not recoverable. Assumes enough data remains for the result to be acceptable.
Commonly used for sound and image files: JPEG, MPEG and MP3.

109) Lossless Compression: An algorithm is used to retain all the information in a file while reducing
its size. Files are stored and transmitted intact so the original can be reconstructed from this data.
Commonly used for the following files: program code, ZIP, GIF and PNG.

110) RUN LENGTH ENCODING (RLE): Repeated occurrences of the same item (such as a pixel, word or
other grouping of bits) are stored once, together with the number of occurrences, e.g. 100 blue pixels
can be stored as B100. Used in TIFF and BMP files.
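A minimal RLE encoder in Python (the pixel values are illustrative):

    # Each run of repeated values is stored once with a count.
    def rle_encode(pixels):
        runs, count = [], 1
        for prev, cur in zip(pixels, pixels[1:]):
            if cur == prev:
                count += 1
            else:
                runs.append((prev, count))
                count = 1
        runs.append((pixels[-1], count))
        return runs

    print(rle_encode(list("BBBBBWWWBB")))   # [('B', 5), ('W', 3), ('B', 2)]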

111) DICTIONARY CODING: The compression algorithm searches through the text to find suitable
entries in its own dictionary, or it may use a known dictionary, and translates the message
accordingly. Works by substituting entries with a unique code. Used in ZIP, GIF and PNG files.

112) ENCRYPTION KEYS: Are long random numbers which have the information needed to encrypt
and decrypt messages. The public key is publicly available to all. The private key is kept confidential
to its owner.

113) SYMMETRIC ENCRYPTION: The same key is used to encrypt and to decrypt. Requires both
parties to have a copy of the key, which can't be transmitted over the Internet or an eavesdropper
monitoring the message may see it.
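A toy illustration in Python of the same key both encrypting and decrypting, using XOR; this is deliberately simplistic and not a secure cipher.

    # XOR with a repeating key: applying the same key twice restores the data.
    def xor_cipher(data: bytes, key: bytes) -> bytes:
        return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

    key = b"secret"
    ciphertext = xor_cipher(b"meet at noon", key)
    plaintext = xor_cipher(ciphertext, key)   # the same key reverses it
    print(ciphertext, plaintext)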

114) ASYMMETRIC ENCRYPTION: Different keys are used to encrypt and to decrypt – this is more
secure. The public key encrypts the data. The private key decrypts the data.

115) HASHING ALGORITHMS: Used to transform data, e.g. network passwords, which are stored in an
abbreviated form, e.g. 123456 might become 456. It is difficult to regenerate the original from the hash
value, but easy to check: the login attempt is hashed again and compared. Vulnerable to brute-force
attacks. A low chance of collision (i.e. different inputs giving the same output) reduces the risk of
different files being marked as the same. Provides a smaller output than input, so it is quicker to
calculate and compare hashes than the original data.
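A sketch of the check-by-rehashing idea using Python's hashlib; real systems would also add a salt and use a dedicated, deliberately slow password-hashing function.

    import hashlib

    # The stored hash cannot feasibly be reversed, so a login attempt is
    # hashed again and the two hashes compared.
    def hash_password(password: str) -> str:
        return hashlib.sha256(password.encode()).hexdigest()

    stored = hash_password("hunter2")
    print(hash_password("hunter2") == stored)   # True  - login accepted
    print(hash_password("hunter3") == stored)   # False - login rejected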

1.3.2 Databases

116) Database (DB): Structured and persistent stores of data organised for ease of processing, i.e. held
on secondary storage and non-volatile. Allow data to be retrieved quickly, updated easily and filtered
for different views.

117) Relational Database (RDB): Based on linked tables (relations). Each table is based on an entity
and has rows and columns. Each row (tuple) in a table is equivalent to a record and is constructed in
the same way. Each column (attribute) is equivalent to a field and must have just one data type. One
column or combination of columns must be the PK. Reduces and avoids data duplication and data
redundancy to save storage space. Improves data consistency and data integrity. Easier to change
data and data format; data can be added more easily. Improves levels of data security, making it
easier to control access to data.

118) Flat-File Database (FFDB): Simple data structure tables that are easy to maintain as
only a limited amount of data is stored. They are of limited use because they may have redundant
and inconsistent data. No specialist knowledge is needed to operate them. They are harder to update,
and the data format is difficult to change.

119) Primary Key (PK): Is a unique identifier in a table used to define each record.

120) Foreign Key (FK): The PK of one table used as an attribute (FK) in another to provide links or
relationships between tables. Represents a 1-m (one-to-many) relationship, where the FK is at the
"many" end of the relationship, to avoid data duplication. This allows relevant data to be extracted
from different tables.

121) Secondary Key (SK): An attribute that allows a group of records in a table to be sorted and
searched differently from the PK and data to be accessed in a different order.

122) Entity Relationship Diagram (ERD): Necessary when planning a RDB. Uses a diagram to show
how data tables relate to each other. Helpful in reducing redundancy.

123) NORMALISATION: A formal, methodical process for designing data tables optimally. Goes through
distinct stages, leading to at least 3NF. Resolves m-m (many-to-many) relationships. Minimises
repetition to reduce data redundancy. Ensures all attributes in a table depend on the key, avoiding
the need to update multiple data entries when changing a single attribute and reducing the chance
of mistakes.

124) 1NF (FIRST NORMAL FORM): Separates out the multiple items/sets of data in each row.

125) 2NF (SECOND NORMAL FORM): Removes data that occurs in multiple rows and puts those data
items in a new table. Creates relationships/links between the tables as necessary via repeated fields.

126) 3NF (THIRD NORMAL FORM): Removes non-key dependencies (i.e. transitive relationships) to
their own linked table so every non-key attribute/field depends on the key, the whole key and
nothing but the key!
127) INDEXING: The PK is normally indexed for quick access. The SK is an alternative index allowing
for faster searches based on different attributes. The index takes up extra space in the database.
When a data table is changed, the indexes have to be rebuilt.

128) Serial File: Relatively short and simple files in which data records are stored chronologically, i.e.
in the order in which they are entered. New data is always appended to the end of the file, after the
existing records. To access a record, you have to start from the first item and read each preceding item.
Easy to implement. Adding new records is easy. Searching is easy but slow.

129) Sequential File: Are serial files where the data in the file is ordered logically according to a key
field in the record.

130) Indexed Sequential File: Records are sorted according to a PK. A separate index is kept that
allows groups or blocks of records to be accessed directly and quickly. New records need to be
inserted in the correct position, and the index has to be maintained and updated to be kept in sync
with the data. More difficult to manage, but accessing individual records is much faster. More space
efficient. More suited to large files.

131) Database Management System (DBMS): Is software that creates, maintains and handles the
complexities of managing a database. May provide a UI. May use SQL to communicate with other
programs. Provides different views of the data for different users. Provides security features. Finds,
adds and updates data. Maintains indexes. Enforces referential integrity and data integrity rules.
Manages access rights. Provides the means to create the database structures: queries, views, tables,
interfaces and outputs.

132) Query: Isolates and displays a subset of data. QBE: query by example.

133) STRUCTURED QUERY LANGUAGE (SQL): Is a declarative database language that allows the
creation, interrogation and alteration of a database.
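A small sketch using Python's built-in sqlite3 module to create two linked tables and interrogate them; the table and column names are invented for the example.

    import sqlite3

    # Two tables linked by a primary key / foreign key, then a JOIN query.
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE Customer (
            CustomerID INTEGER PRIMARY KEY,
            Name       TEXT NOT NULL
        );
        CREATE TABLE CustomerOrder (
            OrderID    INTEGER PRIMARY KEY,
            CustomerID INTEGER REFERENCES Customer(CustomerID),  -- foreign key
            Item       TEXT
        );
        INSERT INTO Customer VALUES (1, 'Aisha');
        INSERT INTO CustomerOrder VALUES (10, 1, 'Keyboard');
    """)
    query = """SELECT Customer.Name, CustomerOrder.Item
               FROM Customer
               JOIN CustomerOrder ON Customer.CustomerID = CustomerOrder.CustomerID"""
    for row in db.execute(query):
        print(row)   # ('Aisha', 'Keyboard')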

134) REFERENTIAL INTEGRITY: Transactions should maintain referential integrity. This means
keeping a database in a consistent state so changes to data in one table must take into account data
in linked tables, e.g. you cannot delete data that is linked to existing data in another table. It is often
enforced by the DBMS.

135) TRANSACTION: A change in the state of a database: addition, deletion or alteration of
data. Transactions must conform to the ACID rules:

136) ATOMICITY: They should either succeed or fail but never partially succeed.

137) CONSISTENCY: The transaction should only change the database according to the rules of the
database.

138) ISOLATION: Each transaction shouldn't affect or overwrite other transactions concurrently
being processed.

139) DURABILITY: Once a transaction has been completed, its changes must remain no matter what happens.

140) RECORD LOCKING: Is the technique of preventing simultaneous access to objects in a database
in order to prevent updates from being lost or inconsistencies in the data arising. A record is locked
whenever a user retrieves it for editing or updating. Anyone else attempting to retrieve the same
record is denied access until the transaction is completed or cancelled, e.g. if one transaction is
amending a record, no other transaction can amend it until the first is complete.
141) DATA REDUNDANCY: Unnecessary repetition of data, which leads to inconsistencies. Deliberate
redundancy, however, is desirable for resilience: if part of a database is lost, it should be recoverable
from elsewhere. Such redundancy can be provided by a RAID setup or by mirroring servers.

1.3.3 Networks

142) Private Network: Advantages: security, control of access, confidence of availability.
Disadvantages: need for specialist staff, organised backups and security. More organisations and
individuals are instead using 'the cloud'.

143) Star Topology: Most networks have a star layout. Characterised by a separate physical link from
each node to a switch or hub. They are resilient.

144) Bus Topology: Nodes are attached to a single backbone, so are vulnerable to breakages. Are
prone to data collisions. They are uncommon these days.

145) Ring Topology: Each node connects to exactly 2 other nodes. Data frames are sent in one
direction to minimise collisions. Are easily disrupted.

146) Protocol: A set of rules to control and govern data transmission between devices on a network.

147) Transmission Control Protocol/Internet Protocol (TCP/IP): A suite of protocols developed
for the internet that covers: data format, addressing, routing and receiving. It uses 4 layers, numbered
here by their OSI equivalents. 7 – Application: capturing and delivering data, packaging. 4 – Transport:
establishment and termination of connections. 3 – Internet: provides the means of transmission across
different types of network; concerned with IP addressing and the direction of datagrams. 2 – Link:
concerned with passing datagrams to physical devices and media. As with the OSI model, the top
layers are close to the user; the bottom layers are close to physical transmission.

148) Open Systems Interconnection (OSI) Model: An openly available (non-proprietary) network
model. 7 layers are provided in the OSI model: 7 – Application: collecting and delivering data in the
real world. 6 – Presentation: data conversions. 5 – Session: manages connections. 4 – Transport:
packetizing and checking. 3 – Network: transmission of packets, routing. 2 – Data Link: access
control, error detection and correction. 1 – Physical: network devices and media.

149) Domain Name System (DNS): A system for naming resources on a network. A hierarchical
system provides human-readable equivalents to IP addresses, such as bbc.co.uk. Domain names are
held on servers, and requests are passed through DNS servers in an attempt to map the domain to an IP
address. If a DNS server can't resolve the domain, it passes the request recursively to another DNS server.
The DNS server sends the IP address to the browser so it can retrieve the website from the server on
which it is hosted.

150) Layering: Is a form of abstraction. Division of a complex system into its component parts.
Allows work to be carried out piecemeal. Each layer communicates only with adjacent layers. Allows
efficient problem solving – focusing on one part of the problem in isolation. Basic layers: 1 –
Physical: the hardware that provides the connections. 3 – Network: concerned with routes from
sender to recipient. 7 – Application: concerned with collecting and delivering data to and from users.
These layers are usually further subdivided in real networks.

151) Media Access Control (MAC) Address: Are unique identifiers associated with a network
interface. They: provide addressing capability in a network. Are usually assigned by the
manufacturer. Are 48-bit (6-byte) addresses, i.e. 6 octets, e.g. DC-85-DE-4B-FB-3A.
152) Internet Protocol (IP) Addressing: Each device on an IP network has a numerical address made
of 4 numbers, each between 0 and 255 (i.e. a 32-bit address), that uniquely identifies a device on
a network. It is a logical identifier so can change on a physical device. Is used to route messages.
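
To illustrate the 'four numbers, 32 bits' point, a short Python sketch (illustrative only) that
converts a dotted-quad IPv4 address into the single 32-bit number it represents:

    # Each of the 4 octets holds 8 bits, giving 32 bits in total.
    def to_32bit(address):
        value = 0
        for octet in address.split("."):
            value = value * 256 + int(octet)   # shift left 8 bits, add octet
        return value

    print(to_32bit("192.168.0.1"))   # 3232235521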

153) Static Addressing: IP addresses are assigned permanently to a device.

154) Dynamic Host Configuration Protocol (DHCP): IP addresses are automatically assigned as
needed.

155) Subnet: A subdivision of a network with its own range of IP addresses, used to organise the network and conserve addresses.

156) Personal Area Network (PAN): Linked personal devices

157) Storage Area Network (SAN): Block-level storage. Devices are consolidated, so are not visible
to users

158) Local Area Network (LAN): A group of computers or shared devices linked together over a small
geographical area on one site. The infrastructure and connections are usually owned by the network
owner. Allows communication using hardwired or wireless over a short range from a central point. Is
more secure. Requires no extra communication devices.

159) Wide Area Network (WAN): Covers a large geographical area. Computers are over remote
distances. Typically each computer will have its own peripherals rather than sharing them. Tends to
use different forms of communication media links supplied by an external third party, e.g.
telephone lines, microwave, satellite. Data is subject to interception and attack. Requires modem.

160) Data Packet: A file is divided or split into equal groups of bits of standard size made up of
control bits and data. Packets have a structure defined by the protocol being used, with an identity
label attached. There are 3 basic parts: Header: sender's/transmitting IP address,
receiver's/destination IP address, protocol, packet number and order, i.e. the place of the packet in
the complete message. Payload: the data file to be transmitted. Trailer: end-of-packet marker and
error-correcting code.
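
A hypothetical Python sketch of the header/payload/trailer structure described above (the field
names and addresses are invented for illustration; real formats are fixed by the protocol in use):

    from dataclasses import dataclass

    @dataclass
    class Packet:
        sender_ip: str       # header
        destination_ip: str  # header
        protocol: str        # header
        number: int          # header: position in the complete message
        payload: bytes       # the chunk of the file being carried
        checksum: int        # trailer: simple error-checking value

    message = b"Hello, world!"
    chunks = [message[i:i + 4] for i in range(0, len(message), 4)]
    packets = [Packet("10.0.0.2", "10.0.0.9", "TCP", n, c, sum(c) % 256)
               for n, c in enumerate(chunks)]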

161) Packet Switching: Central to the success of the internet. Operates in connectionless mode, i.e.
no permanent connection is established for the message. Has no established route or pre-set path. At
each node on the network the destination address is read and the best available route is found from
source to destination, so individual packets are sent onto the network along the most convenient
available paths. Transmission is safer from interception and avoids message failure if a route is
disrupted: because packets use different routes it is impossible to intercept them all, so it is more
secure. Packets arrive out of order at the receiving end and need to be reordered so the message can
be reassembled to recreate the data at its destination. Is only as fast as the slowest packet.
Maximises use of the available infrastructure very efficiently as each channel is only used for a
short time so does not tie up a proportion of the network. Loss of part of the communication will not
be fatal: if there is an error then only a small, identifiable part of the data (one packet) is
affected, which can be retransmitted easily if it does not arrive safely. Error checking promotes
successful transmission.

162) Circuit Switching: There are three phases: connection establishment, data transmission and
connection termination. Physically connects devices together - so-called 8connection mode9. Devices
remain connected for the duration of data transmission. Establishes a route before transmission
between the two computers for the duration of the message. Ties up large areas of network so no
other data can use any part of the circuit until the transmission is complete. Sends all packets on the
same reserved route down the circuit in order. Message can be interpreted if the route is tapped
into. Packets remain in correct order but must be reassembled at the destination.

163) AUTHENTICATION: Confirming the identity of a user, typically with a valid user ID and password.

164) FIREWALL: Anti-hacking applications (hardware and/or software) that sit between the
system and external access to control traffic into and out of a network, ensure
communications are restricted and attempt to limit or prevent access to the system by particular
unauthorised sources, systems, users or external machines.

165) PROXY: Is a computer placed between a network and a remote resource that intercepts traffic
and restricts which users from the internet are allowed access to individual machines on the
network, isolating the network from the outside world.

166) ENCRYPTION: Widely used when transmitting information on the internet and in networks
because data can be intercepted. Protects files so that if unauthorised access is gained the data is
unintelligible. It is important in VPNs because of the number of users sharing the physical network.
Features such as digital signatures or certificates use encrypted messages to confirm the identity of
the sender. Protocol based protection like Secure Socket Layer (SSL) allows for an encrypted link
between computer systems to stop third party access.

167) ILLEGAL HACKER: Attack and exploit weaknesses in a system to access data such as usernames
and passwords, personal information, etc.

168) DENIAL OF SERVICE (DOS): Attacks send requests from multiple users, or bots, to disrupt the
service for political reasons or simply to blackmail the service owner/provider.

169) MODEM: Converts the signal between analogue (A) and digital (D) to act as a gateway to the
WAN. Works at the physical and data link layers.

170) ROUTER: A device that forwards data packets between two networks. Works at the network
layer.

171) NETWORK INTERFACE CONTROLLER (NIC): Generates and receives electrical signals. Works at
the physical and data link layers.

172) WIRELESS ACCESS POINT: Is usually connected to a router. Works at the data link layer.

173) Client-Server: The client computer requests services from the server. The server is a high-end
computer. The server provides services such as file and print, web, email, data processing and
storage. The client code is less complex so can be implemented on multiple platforms. The servers
can be upgraded to fix security problems and provide more features.

174) Peer-to-Peer (P2P): All computers are of equal status. Computers can act as a client or server,
or both. Useful on the internet so traffic can avoid servers. Is private so there isn't a reliance on the
company's server and its connection to the Internet. This means the company hasn't got to invest in
lots of expensive hardware and bandwidth so it is cheaper. The system is likely to be more fault tolerant.

1.3.4 Web Technologies

175) World Wide Web (WWW): The collection of billions of web pages. Pages written in HTML.
Pages have tags to indicate how text is to be handled. Pages have hyperlinks. Pages have various
assets: images, forms, videos and applets.
176) Browser: Software that renders/displays HTML pages. Finds web resources by accepting URLs
and following links. Also useful for finding resources on private networks. Common well-known
browsers are: Firefox, Microsoft Edge, Google Chrome, Opera and Safari.

177) Hypertext Mark-up Language (HTML): The standard for making web pages: text files with tags
attached to items that affect how they are rendered.

178) Cascading Style Sheets (CSS): Determine how tags affect objects. Usually used to standardise
the appearance and behaviour of a web page. Details are in the CSS file not in the HTML file:
Promotes simpler HTML so can be used in multiple HTML files. Content and formatting are kept
separate (formatting code doesn't have to be rewritten for every page). Cached by a browser so the
site is quicker to access as the formatting information isn't reloaded for every page. Changes can be
made to the external style sheet and affect the whole site, saving time and promoting consistency on
the website (changes don't have to be made to every page). Stylesheets can be changed or
formatted specifically for different themes, or different devices, to allow different display
characteristics of the same web page on different platforms.

179) JavaScript: A popular programming scripting language that requires a runtime environment,
typically a web browser, to provide its necessary objects and methods. Can be embedded into HTML
with the <script> tag to add functionality and interactivity to web pages, which allows dynamic
content such as: validation, animation and loading new content. Normally used client-side/in the
browser to reduce unnecessary load on the server, but because client-side code can be amended and
circumvented, critical processing is repeated server-side. Is normally interpreted so will run in any browser.

180) SEARCH ENGINE: Is a software program that finds information on the web. The software: Builds
indexes, makes use of content as well as meta tags (information not displayed but read by search
engines), uses various algorithms to search the indexes and supports many human languages.
Improvements have allowed success with misspelled searches.

181) SEARCH ENGINE INDEXING (SEI): Is the process of collecting and storing data from websites so
that a search engine can quickly match the content against search terms.

182) PageRank Algorithm (PRA): Developed by Google. Attempts to rank pages according to
usefulness and to measure the importance of a site by taking into account: the number of inward
links - the more there are, the better regarded the site; the number of sites that link to the
site; and the PageRank of the linking sites - the algorithm iteratively calculates the importance of each
site so that links from sites with a high importance are given a higher ranking than those linked from
sites of low importance.

183) SERVER SIDE PROCESSING: Takes place on the webserver. Data is sent from the browser to the
server, the server processes it and sends the output back to the browser. Takes away the reliance
on the browser having the correct interpreter. It hides the code from the user, protecting copyright
and avoiding it being amended or circumvented, which is essential for data security. Puts extra load
on the server, at the cost of the company hosting the website. Is best used where it is integral
that processing is carried out. It is often used for generating content. It can be used to access data
including secure data; for this reason any data passed to it has to be checked carefully.

184) CLIENT SIDE PROCESSING: Takes place in the web browser. Doesn't require data to be sent
back and forth, meaning code is much more responsive. Code is visible, which means it can be copied.
The browser may not run the code, either because it doesn't have the capability or because the user
has intentionally disabled client-side code. Reduces the load on the server. Frees the server to do
more processing. Reduces data traffic. Sends better-quality data to the server. Is best used when the
code that runs is not critical; if it is critical then it should be carried out on the server. Is also best
where quick feedback to the user is needed - an example being games.

1.4 Data types, data structures and algorithms

How data is represented and stored within different structures. Different algorithms that can be
applied to these structures

1.4.1 Data Types

185) Integer: Whole number values with no decimal part, e.g. 7, -51, 612

186) Real/Floating Point: Numbers with decimal or fractional parts, e.g. 36.8, -13.21, π

187) Character: Single letter, digit, symbol or control code, e.g. D, h, 8, *

188) String: A string of alphanumeric characters, e.g. data, Gh8yU7, ^8*k

189) Boolean: One of two values, e.g. True or False

190) Character Set: The characters or symbols that can be recognised, represented, interpreted,
understood and used by a computer. Each required character is represented by a unique binary code
or number so each symbol is distinguishable from all others. The number of bits used for one
character is typically 1 byte. Normally determinable by reference to the characters available on a
keyboard, e.g. digits, letters and symbols. May include control characters. Example codes: ASCII and
UNICODE use 8 and 16 bits per character respectively. The number of characters will tend to be a
power of 2. Allows keys to represent different characters. The more characters required, the more
bits in each code, as used for extended character sets.

191) ASCII: Each character of the alphabet and some special symbols and control codes are
represented by agreed binary patterns, originally using 7 bits and later extended to 8. The number of
characters in the character set is limited to 256, making it impossible to display the wide range of
characters for other alphabets or symbol sets.

192) UNICODE: Is a character set mapping different binary values to characters on screen. Originally
a 16-bit coding system that assigns a unique code to all the possible symbols available throughout
the world. All symbols in different languages, platforms and programs have unique codes. Updated
to remove the 16-bit restriction by using a series of code pages with each page representing the
chosen language symbols. Continues to grow as it is not a fixed size set so supports a very large
number of characters, currently allowing over 100000 symbols represented. It is backward
compatible with ASCII so the original ASCII representations have been included with the same
numeric values.
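
In Python, ord() and chr() convert between characters and their code points, which makes the
ASCII/Unicode relationship easy to demonstrate:

    print(ord("A"))   # 65 - the same value in ASCII and Unicode
    print(chr(65))    # 'A'
    print(ord("€"))   # 8364 - outside ASCII, so Unicode is needed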

1.4.2 Data Structures

193) Array: A data structure which contains a set of data items of the same data type grouped under
a single identifier. Is static so size cannot change. Each individual element can be addressed and
accessed directly via its index or subscript i.e. random access. Contents are stored contiguously in
computer memory. Unlike lists, they can be multi-dimensional.

194) Record: Are data stores organised by attributes (fields).

195) List: Are data stores organised by an index.


196) Tuple: Are lists that cannot be modified once set up i.e. immutable.

197) LINKED LIST: Is a dynamic data structure. Uses index values and pointers to order a list in a
specific way. Can be organised on more than one category. Needs to be traversed until the desired
element is found. To add data to a list, it is added in the next available space and the pointers are
updated accordingly. To remove an item from the list, the pointer in the previous item is set to the
value in the item to be removed, effectively bypassing the removed item. The contents may not be
stored contiguously in memory.
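
A minimal Python sketch of the pointer updates described above (here the 'pointer' is an object
reference):

    class Node:
        def __init__(self, data):
            self.data = data
            self.next = None            # None marks the end of the list

    head = Node("apple")
    head.next = Node("cherry")

    # Insert "banana" after the head: only pointers change, no data moves.
    new_node = Node("banana")
    new_node.next = head.next
    head.next = new_node

    # Remove "banana": bypass it by re-pointing the previous node.
    head.next = head.next.next

    # Traverse from the head until the end of the list is reached.
    node = head
    while node is not None:
        print(node.data)
        node = node.next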

198) GRAPH: Is a collection of data nodes/vertices and the connections/edges set between them.
The connections (edges) can be: directed (one-way) or undirected (bi-directional), weighted or
unweighted. Can be represented by an adjacency matrix.

199) Stack: Are LIFO ('last in, first out') dynamic data structures. There are two pointers: a top and a
bottom. The top is often called the stack pointer. Data is added to and removed from the top of the
stack. Use the command PUSH to add data and POP to remove data.

200) Queue: Are FIFO ('first in, first out') dynamic data structures. There are two pointers: a start
and an end. The start is often called the queue pointer. Data is added at the end and removed
from the start of the queue. Use commands such as enqueue (push) to add data and dequeue (pop) to remove data.
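
A quick Python sketch contrasting the two structures (a list serves as the stack;
collections.deque gives an efficient queue):

    from collections import deque

    stack = []              # LIFO
    stack.append(10)        # PUSH
    stack.append(20)        # PUSH
    print(stack.pop())      # POP -> 20, the last item added leaves first

    queue = deque()         # FIFO
    queue.append(10)        # add at the end
    queue.append(20)
    print(queue.popleft())  # remove from the start -> 10, first in, first out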

201) TREE: Are dynamic branching data structures. They consist of nodes that have sub nodes
(children). The first node at the start of the tree is called the 'root node'. The lines that join the
nodes are called 'branches'.

202) BINARY SEARCH TREE (BST): One specific kind of tree is a BST, where each node has a
maximum of 2 children, on a left branch and a right branch. To add data to the tree, it is placed at
the end of the list in the first available space and added to the tree following the rules: if a child
node is less than a parent node, it goes to the left of the parent; if a child node is greater than a
parent node, it goes to the right of the parent. Depth-first (post-order) traversal: visit all nodes to
the left of the root node, visit right, then visit the root node; repeat these three steps for each node
visited. Depth first isn't guaranteed to find the quickest solution and possibly may never find the
solution if we don't take precautions not to revisit previously visited states. Breadth-first traversal:
visit the root node, visit all direct subnodes (children), then visit all subnodes of the first subnode;
repeat these three steps for each subnode visited. Breadth first requires more memory than depth
first search. It is slower if you are looking at deep parts of the tree.
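
A minimal Python sketch of BST insertion following the left/right rules above, with an in-order
traversal (left, root, right) that visits the stored values in ascending order:

    class BSTNode:
        def __init__(self, value):
            self.value = value
            self.left = None
            self.right = None

    def insert(node, value):
        if node is None:
            return BSTNode(value)
        if value < node.value:
            node.left = insert(node.left, value)    # smaller values go left
        else:
            node.right = insert(node.right, value)  # larger values go right
        return node

    def in_order(node):
        if node is not None:
            in_order(node.left)
            print(node.value)
            in_order(node.right)

    root = None
    for v in [8, 3, 10, 1, 6]:
        root = insert(root, v)
    in_order(root)   # prints 1 3 6 8 10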

203) HASH TABLE: Enable access to data that is not stored in a structured manner.

204) HASH FUNCTION: Generate an address in a table for the data that can be recalculated to locate
that data.
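
A toy Python sketch of both ideas: a deliberately simple hash function turns a key into a table
address, and recalculating it later locates the data. Real hash functions spread keys far better,
and a full table would also handle collisions:

    TABLE_SIZE = 11
    table = [None] * TABLE_SIZE

    def hash_key(key):
        # Sum the character codes, then wrap into the table's range.
        return sum(ord(c) for c in key) % TABLE_SIZE

    table[hash_key("cat")] = "feline"   # store at the computed address
    print(table[hash_key("cat")])       # recalculate the address to retrieve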

1.5 Legal, moral, cultural and ethical issues

The individual moral, social, ethical and cultural opportunities and risks of digital technology.
Legislation surrounding the use of computers and ethical issues that can or may in the future arise
from the use of computers

1.5.1 Computing related legislation

205) Data Protection Act (DPA) 1998: Is designed to protect personal data and focuses on
controlling the storage of data about the data subject. There are eight provisions: Data must be
lawfully collected and processed so that individual rights are not flouted. Data should only be used
for the purpose specified to the Data Protection Registrar and should not be disclosed to other parties
without the necessary permission. Data collected should not be excessive so that irrelevant data is
not stored. Data should be accurate and up to date. Data should not be kept longer than necessary.
Individuals have the right to see the data and to ask for it to be corrected if it is wrong so that they
are not responsible for incorrect data. Data should be protected by adequate security measures so
that people with malicious intent cannot gain access. Data should not be transferred out of the EU
so that data remains subject to DPA. The Act also includes requirements that: Data can only be
accessed or changed by authorised people (the data controllers) so that malicious alterations are not
made. Authorised people must be notified to the Data Protection Register so that they are
accountable. There are some exemptions to the Act: Any data processed in relation to national
security, used to detect or prevent crime or to assist with the collection of taxes and used solely for
individual, family or household use is exempt from the Act.

206) Computer Misuse Act (CMA) 1990: This is the law aimed at illegal hackers who exploit
weaknesses in a system. Under this Act it is an offence to gain unauthorised access to computer
material: With intent to commit or facilitate commission of further crimes. With intent to impair, or
with recklessness as to impairing, operation of computer, etc. (e.g. distributing viruses).

207) Copyright Design and Patents Act (CDPA) 1988: Any individual or organisation that produces
media, software or other intellectual property has their ownership protected by this Act. This means
other parties are not allowed to copy, reproduce or redistribute it without permission from the
copyright holder. Illegal copying and distribution of software prevents the developer of that
software from receiving some or all of their earnings for their work.

208) Regulation of Investigatory Powers Act (RIPA) 2000: This Act is about criminal and terrorist use
of the internet. It regulates how the authorities can monitor our actions. Under this Act certain
organisations can: Demand ISPs provide access to a customer's communications. Allow mass
surveillance of communications. Demand ISPs fit equipment to facilitate surveillance. Demand
access be granted to protected information. Allow monitoring of an individual's internet activities.
Prevent the existence of such interception activities being revealed in court.

1.5.2 Moral and ethical Issues

209) Computers in the Workforce: Computers in the workplace have changed the skillsets required
by the workforce: Robot manufacturing means fewer direct manufacturing roles and more technical
support roles maintaining the systems. Online shopping means fewer high-street jobs; more
distribution-centre jobs. Online banking has seen the closure of high-street branches.

210) Automated Decision Making: Is an area where some roles can only be performed by
computers: Electrical power distribution requires rapid responses to changing circumstances. Plant
automation. Airborne collision avoidance systems. Automated decision-making also affects the
public in other ways: Credit assessments in banks are carried out by automated systems. Stock-
market dealing is often automated and is believed to have contributed to the 'flash crash' in 2010.
The quality of automated decision-making depends upon the quality and accuracy of the data
available and the precision of the algorithm.

211) Artificial intelligence (AI): A grim view of the potential for artificial intelligence is that it could
end mankind, but there are many benefits too. There are numerous examples where AI is used on a
daily basis, including: credit-card checking that looks for unusual patterns in credit-card use to
identify potential fraudulent use; speech recognition systems that identify keywords and patterns in
the spoken word to interpret the meaning; medical diagnosis systems used to self-diagnose illness
from the symptoms and to support medical staff in making diagnoses; and control systems that monitor,
interpret and predict events. Expert systems are an example of AI and all have a similar structure: a
knowledge base that holds the collected expert knowledge, usually as 'IF ... THEN' rules; an inference
engine that searches the knowledge base to find potential responses to questions; and an interface
to connect with the user or to the system it is controlling.

212) Environmental Effects: Computers are made from some pretty toxic materials, including:
airborne dioxins, polychlorinated biphenyls (PCBs), cadmium, chromium, radioactive isotopes,
mercury. These need to be handled with great care when disposing of old equipment. Redundant
computer equipment is often shipped off to countries with lower environmental standards. In some
cases, children pick over the waste to extract metals that can be recycled and sold.

213) Censorship and the Internet: Censorship is the deliberate suppression of what can be accessed
or published. What we each regard as unacceptable will vary; some things most people can agree
on, others maybe not. In some countries censorship is applied for political reasons. In some
organisations, e.g. schools, censorship is applied beyond that which applies nationally to protect the
individuals from material regarded as unsuitable by the organisation.

214) Monitor Behaviour: CCTV is often used in the workplace to monitor behaviour, but monitoring
of workplace behaviour can extend beyond this. An organisation may track an individual's work to
make sure they are on target. Organisations may monitor social networks to ensure behaviour
outside work is acceptable too.

215) Analyse Personal Information: Analysing data about an individual's behaviour is used
extensively to: predict market trends, identify criminal activity and identify patterns to produce
effective treatments for medical conditions.

216) Piracy and Offensive Communications: Communications Act (CA) 2003: This Act makes it illegal
to 'steal' Wi-Fi access or send offensive messages or posts. Under this Act, in 2012, a young man was
jailed for 12 weeks for posting offensive messages and comments about the April Jones murder and
the disappearance of Madeleine McCann.

217) Layout, Colour Paradigms and Character Sets: Equality Act (EA) 2010: This Act makes it illegal
to discriminate against individuals by not providing a means of access to a service for a section of the
public. This means web service providers have to make websites more accessible. Some measures
might include: Making it friendly to screen readers, larger fonts or a screen magnifier option, image
tagging, using alternate text for images, choice of contrasting colours in its colour schemes to take
colour blindness into account and transcripts of sound tracks or subtitles.

COMPONENT 2
2.1 Elements of computational thinking

Understand what is meant by computational thinking

218) Caching: Data that has been used is stored in cache/RAM in case it is needed again. Allows
faster access for future use.
219) Reusable Program Components: Software is modular, an example being an object/function.
Modules can be transplanted into new software or can be shared at run time through the use of
program libraries. Modules already tested so more reliable programs. Less development time as
programs can be shorter and modules can be shared.

220) CONCURRENT PROCESSING: Carrying out more than one task at a time/a program has multiple
threads. Multiple processors/Each thread starts and ends at different times. Each processor
performs simultaneously/Each thread overlaps. Each processor performs tasks independently/Each
thread runs independently.

2.2 Problem solving and programming

How computers can be used to solve problems and programs can be written to solve them

2.2.1 Programming techniques

221) Sequence: All instructions are executed once in the order in which they appear.

222) Iteration: A section of code loops or a group of statements or instructions are executed
repeatedly for a fixed or set number of times or until a condition is met.

223) While Loop: Can be controlled at the entry point. The condition is tested before each iteration
and the statements in the loop run repeatedly if the condition is true. Useful when you don't know
how many iterations are needed. The statements in the loop may not be executed at all if the
condition is initially false.

224) Repeat Until Loop: Can be controlled at the exit point. These loops must execute at least once.

225) For Loop: Can be controlled by an automatic counter. The number of iterations is fixed
according to start and end values of a variable set up at the beginning. These loops often have the
ability to count in steps other than 1 or even backwards.

226) Selection: A condition is used to determine which of the sections of code (if any) will be
executed. As a result some instructions may not be executed.

226) If Statement: Gives 2 options at a time.

227) Select Case Statement: There are many alternatives based on multiple values of the same
variable/one expression used to decide which of a number of statement blocks is executed. There
can be a default option. Makes the code clearer, more readable and easier to write. Easier to add
more options. Allows multiple branches to avoid nested IFs. Avoids numerous repeats of similar
conditions.

228) RECURSION: When a subprogram/subroutine/function/procedure calls itself from within the
function. The original call is halted until subsequent calls return. Eventually reaches a stopping
condition or a base case. This can be another way to produce iteration. Humans often express the
problem in a recursive way. Can be closer to a natural language description so more natural to read.
Quicker to write/generally fewer lines of code as some functions are naturally recursive. Suited to
certain problems, for example those using trees. Can reduce the size of a problem with each call.
Creates new variables for each call therefore is a less efficient use of memory. Is more likely to run out
of stack space/memory due to too many calls, causing it to crash. This can be avoided with tail
recursion. The algorithm may be more difficult to trace/follow/debug as each frame on the stack has its
own set of variables. Requires more memory and resources than the equivalent iterative algorithm.
Usually slower than iterative methods due to maintenance of the stack.
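
A standard illustration in Python: factorial written recursively, where each call reduces the
problem size until the base case stops the chain of calls:

    def factorial(n):
        if n == 0:                   # base case / stopping condition
            return 1
        return n * factorial(n - 1)  # this call halts until the next returns

    print(factorial(5))   # 120 - five stack frames are created, then unwound
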
229) Global Variable: A variable that is usually declared and defined at the start of a program
outside subprograms. Is visible and available everywhere throughout the whole program in all
modules including functions and procedures. Used where a value needs to be accessible from
various parts of the program. Allows data to be shared between modules. It is the same value
irrespective of the place where it is accessed e.g. today's date, VAT rate, π. Makes it difficult to
integrate modules. It increases the complexity of a program. It may cause conflicts with names
written by others or in other modules. It may be changed inadvertently when the program is
complex. Global variables are dangerous. They can easily be accidentally altered.

230) Local Variable: A variable declared and defined within one subroutine and is only accessible
and visible within that subsection of the program or construct where it is created. Help to make each
function or procedure reusable. Can be used as parameters. Is destroyed or deleted when the
subprogram exits, so the data contained ceases to exist once the execution of that
part of the program it is in is completed. The same variable names within two different modules will
not interfere with one another to allow the same identifier to be used for different purposes without
overwriting values or causing errors e.g. loop counter. Local variables with the same name as global
variables will override and take precedence over the values in the global variable.

231) Modularity: Program is divided into separate, self-contained, specific modules or tasks.
Modules can be subdivided further into smaller modules. Each individual module is a small part of
the problem and focuses on a small sub-task so is easy to solve, understand, test, debug and read
before integration e.g. by a third party. Easy to maintain, update and replace a part of the system as
the whole program will be well structured with clearly defined interfaces without affecting the rest.
Development takes place in parallel and can be shared between a team of programmers so the
program is developed faster and easier to monitor progress. Modules can be allocated according to
programmers' individual strengths and areas of expertise, improving the quality of the final product.
Different modules can be programmed in different languages suitable for the application as
appropriate. Reduces the amount of code that needs to be produced because code from other
programs can be reused or standard library routines can be used reducing time of development.
Modules must be linked and programmers must ensure that cross-referencing is done. Interfaces
between modules must be planned and testing of links must be carried out.

232) Function: Performs a specific task or calculation and returns a single data type value when
called so it can be used within expressions as a variable. It is often called using its identifier as part of
an expression in the main program. The value returned replaces the function call so that it can be
used inline in the same way as a variable in the main body of the program. Uses local variables. Most
programming languages nowadays use functions.

233) Procedure: Is given an identifier and performs a task but it does not necessarily have to return
a value (values can be passed back via parameters). Receives and usually accepts parameter values so it can be used
multiple times with different data. Can be called from the main program or another procedure.
When called the code in the procedure is executed and then control is passed back to the parent
program or where the procedure is called from when complete. Is used as any other program
instruction or statement in the main program. Uses local variables.

234) Parameter: A description or information about an item of data which is supplied to a
subroutine when it is called. It is given an identifier or name when the subroutine is defined. It is
substituted by an actual value or address when the subroutine is called. May pass values between
functions or procedures by reference or by value. Used as a local variable within the subroutine.
235) By Val: A copy is made of the actual value of the variable and is passed into the procedure.
Does not change the original variable value. If changes are made, then only the local copy of the
data is amended then discarded. Therefore, no unforeseen effects will occur in other modules.
Creates a new memory space.

236) By Ref: The address/pointer/location of the value is passed into the procedure. The actual
value is not sent or received. If changed, the original value of the data is also changed when the
subroutine ends. This means an existing memory space is used.
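
Python always passes object references, but the by-value/by-reference contrast can be imitated
for illustration by copying the data before the call:

    def double_all(values):
        for i in range(len(values)):
            values[i] *= 2

    original = [1, 2, 3]
    double_all(original[:])   # like by value: a copy is changed, then discarded
    print(original)           # [1, 2, 3] - unchanged

    double_all(original)      # like by reference: the caller's data changes
    print(original)           # [2, 4, 6]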

237) Integrated Development Environment (IDE): A single program used for developing programs
made from a number of components. IDEs often provide features for: editing, program building,
version control, debugging, testing and compilation.

238) Debugging Tools: allow inspection of variable values, this can allow run-time detection of
errors. Code can be examined as it is running which allows logical errors to be pinpointed. IDE
debugging can produce a crash dump, which shows the state of variables at the point where an error
occurs. It can display stack contents which show the sequencing through procedures or modules.

239) Translator Diagnostics: reports errors, especially syntax errors, when they are made. Suggests
solutions and informs the programmer, who can then correct the error and translate again, but
sometimes the error messages are incorrect or in the wrong place.

240) Breakpoint: used to test the program works up to specific points or at specified lines of code.
Allows the programmer to check current values of variable contents at chosen strategic points. Can
set a predetermined point where the program stops running or halts in execution in order to inspect
its state.

241) Variable Watch: monitors the status of variables and objects as you step through code and
causes the program to halt in execution if a condition is met, such as a variable changing.

242) Stepping: can set the program to step through code one statement at a time. Slows down
execution to observe path of execution and changes to variable values. Allows the programmer to
watch the effect of each line of code in turn to find the point where an error occurs. Can be used
with breakpoints or variable watches.

2.2.2 Computational methods

243) COMPUTATIONAL METHODS: Could be used to break the problem down into sections:
Statistics can be compiled. Models of new situations can be produced. Simulations can be run by a
computer. Variables can be used to represent data items. Algorithms can be devised to test possible
situations under different circumstances. Features that make a problem solvable by computational
methods: Involves calculations as some issues can be quantified - these are easier to process
computationally. Has inputs, processes and outputs. Involves logical reasoning.

244) PROBLEM DECOMPOSITION: Splits problem into sub-problems. Splits these problems further
until each problem can be solved. Allows the use of divide and conquer. Increase speed of
production. Assign areas to specialities. Allows use of pre-existing modules. Allows re-use of new
modules. Need to ensure subprograms can interact correctly. Can introduce errors. Reduces
processing/memory requirements. Increases response speeds of programs. Order of execution
needs to be taken into account – may need data to be processed by one module before another can
use it. Some modules need to be accessible in an unpredictable way. Large human projects benefit
from the same approach.
245) DIVIDE AND CONQUER: Can be used within a task to split the task down into smaller tasks that
are then tackled.

246) Abstraction: Involves the process of separating ideas from particular instances/reality. It is a
representation of reality using symbols to show real-life features or removing unneeded
complexities, design, detail, elements, features, programming and computational resources which
would require unnecessary programming, design effort and extra memory/resources and could
detract from the main purpose of the program. Allows a computational solution. Examples of
abstraction: a variable, a data structure, a network address, layers in a network, a symbol on a map.

247) BACKTRACKING: A strategy for moving systematically towards a solution. It involves looking at
a progression through the stages of solving the problem. If the pathway fails at some point, it goes
back to the last successful stage. It is well exemplified by logic programs; easy to see in some Prolog
programs and when traversing a tree. It is also exemplified by repair strategies for fixing computers:
save state, then try fix and if no fix, then go back to last saved successful state.

248) DATA MINING: A process which involves searching through vast quantities or large amounts of
unconnected data. May be from different databases: pattern matching algorithms, anomaly
detection algorithms, cluster analysis and regression analysis/calculation of correlation. There may
be no predetermined matching criteria. A brute force approach is possible with high speed
computers. Many applications, e.g. business modelling, determining the characteristics of spam for
email filters or purchasing habits. Attempts to show relationships between facts, components and
events that may not be immediately obvious. Used to plan for future eventualities. Has vast
processing requirements so needs powerful computers.

249) HEURISTICS: Use a 'rule of thumb'/educated guess approach to arrive at a solution when it is
unfeasible to analyse all eventualities. Scan for the data most likely to help and make a judgement
based on past experience. Look at the likelihood of a good solution. Are useful when there are many
ill-defined variables. Can produce a 'good enough' solution although the result is not 100% reliable.
Are used to assess potential malware: looks at behaviour rather than structure so can uncover
suspicious activity even if produced in a novel way. Examine the susceptibility of the system to
possible attacks. Simulate the possible effects of a suspected virus. Sometimes decompile the
suspicious program, then analyse the resulting source code. Are used to speed up the process of
finding the solution in the A* algorithm. It is best not to rely too much on heuristics for life-and-
death scenarios.

250) PERFORMANCE MODELLING: One of many instances of modelling in computer systems. An
example of abstraction - performs virtual actions. Predicts the behaviour of computer systems before
implementation. Useful where implementation is expensive or dangerous. Makes use of existing
data to make predictions. Randomness built in where real-life parameters are not fully understood,
e.g. climate modelling.

251) PIPELINING: Data/processes are arranged in a series, directing the output of one process into
the input of the next. Requires identifying which processes can run at the same time and which must
be sequential. Analogous to a factory production line in real life: the product is passed from one
process to the next so all processes can proceed at the same time. Used in: queuing up instructions
to the processor; pipes to pass data between programs, e.g. the | symbol in Unix, or popen()
and pipe() in C; graphics pipelines. Can allow simultaneous processing of instructions where the
processor has multiple cores.
252) VISUALISATION: A computer process presents data in an easy-to-grasp way for humans to
understand. Trends and patterns can often be better comprehended in a visual display. Graphs are a
traditional form of visualisation. Computing techniques allow mental models of what a program will
do to be produced.

2.3 Algorithms

The use of algorithms to describe problems and standard algorithms

2.3.1 Algorithms

253) COMPLEXITY: Is a measure of how much the time, memory space or resources needed for an
algorithm increase as the size of the data it works on increases. Is usually represented in Big-O
notation.

254) BIG-O NOTATION: just shows the highest order component with any constants removed.
Shows the limiting behaviour of an algorithm to classify its complexity. Evaluates the worst case
scenario for the algorithm.

255) CONSTANT COMPLEXITY: This is where the time taken for an algorithm stays the same
regardless of the size of the data set. For example, printing the first letter of a string. No matter how
big the string gets it won't take longer to display the first letter.

256) LINEAR COMPLEXITY: This is where the time taken for an algorithm increases proportionally or
at the same rate with the size of the data set. For example, finding the largest number in a list. If the
list size doubles, the time taken doubles.

257) POLYNOMIAL COMPLEXITY: This is where the time taken for an algorithm increases
proportionally to n to the power of a constant. Bubble sort is an example of such an algorithm.

258) EXPONENTIAL COMPLEXITY: This is where the time taken for an algorithm increases
exponentially as the data set increases. Travelling Salesman Problem is an example of such an
algorithm. The inverse of logarithmic growth. Does not scale up well as the number of data items
increases.

259) LOGARITHMIC COMPLEXITY: This is where the time taken for an algorithm increases
logarithmically as the data set increases. As n increases, the time taken increases at a slower rate,
e.g. Binary search. The inverse of exponential growth. Scales up well as does not increase
significantly with the number of data items.

260) Bubble Sort: Is intuitive (easy to understand and program) but woefully inefficient. Uses a temp
element. Moves through the data in the list repeatedly in a linear way. Start at the beginning and
compare the first item with the second. If they are out of order, swap them and set a variable
swapMade true. Do the same with the second and third item, third and fourth, and so on until the
end of the list. At the end of the list, if swapMade is true, change it to false and start again;
otherwise the list is sorted and the algorithm stops.
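
A direct Python translation of the description above, using the swapMade flag:

    def bubble_sort(items):
        swap_made = True
        while swap_made:
            swap_made = False
            for i in range(len(items) - 1):
                if items[i] > items[i + 1]:
                    # Swap adjacent out-of-order items.
                    items[i], items[i + 1] = items[i + 1], items[i]
                    swap_made = True
        return items            # a pass with no swaps means sorted

    print(bubble_sort([5, 1, 4, 2]))   # [1, 2, 4, 5]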

261) Insertion Sort: Works by dividing a list into two parts: sorted and unsorted. Elements are
inserted one by one into their correct position in the sorted section, shuffling left until the item to
their left is smaller, until all items in the list have been placed. One of the simplest sort
algorithms. Inefficient and takes longer for large sets of data.
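
A short Python sketch of the shuffling described above:

    def insertion_sort(items):
        for i in range(1, len(items)):      # items[0:i] is the sorted part
            current = items[i]
            j = i - 1
            while j >= 0 and items[j] > current:
                items[j + 1] = items[j]     # shuffle larger items right
                j -= 1
            items[j + 1] = current          # drop into the correct position
        return items

    print(insertion_sort([5, 1, 4, 2]))   # [1, 2, 4, 5]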

262) MERGE SORT: Works by splitting n data items into n sub-lists one item big. These lists are then
merged into sorted lists two items big, which are merged into lists four items big, and so on until
there is one sorted list. Is a recursive algorithm so may require more memory space. Is fast and more
efficient with larger volumes of data to sort.
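
A compact recursive Python sketch (written for clarity over efficiency - the pop(0) calls would
be avoided in production code):

    def merge_sort(items):
        if len(items) <= 1:                 # a one-item list is sorted
            return items
        mid = len(items) // 2
        left = merge_sort(items[:mid])
        right = merge_sort(items[mid:])
        merged = []
        while left and right:               # merge two sorted sub-lists
            merged.append(left.pop(0) if left[0] <= right[0] else right.pop(0))
        return merged + left + right        # append whichever remains

    print(merge_sort([5, 1, 4, 2]))   # [1, 2, 4, 5]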

263) QUICK SORT: Uses divide and conquer. Picks an item as a 'pivot'. It then creates two sub-lists:
those bigger than the pivot and those smaller than it. The same process is then applied recursively to
the sub-lists until all items are pivots, which will be in the correct order. (As this recursive algorithm
can be memory intensive, iterative variants exist.). Alternative method uses two pointers. Compares
the numbers at the pointers and swaps them if they are in the wrong order. Moves one pointer at a
time. Very quick for large sets of data. Initial arrangement of data affects the time taken. Harder to
code.
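
A simple (non in-place) Python sketch of the pivot-and-partition idea; the two-pointer variant
mentioned above works differently but produces the same ordering:

    def quick_sort(items):
        if len(items) <= 1:
            return items
        pivot = items[0]
        smaller = [x for x in items[1:] if x < pivot]   # partition around pivot
        larger = [x for x in items[1:] if x >= pivot]
        return quick_sort(smaller) + [pivot] + quick_sort(larger)

    print(quick_sort([5, 1, 4, 2]))   # [1, 2, 4, 5]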

264) DIJKSTRA’S SHORTEST PATH ALGORITHM: Finds the shortest path between two nodes on a
graph. It works by keeping track of the shortest distance to each node from the starting node. It
continues this until it has found the destination node.
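
A minimal Python sketch using a priority queue, with an invented example graph (each node maps to
a list of (neighbour, weight) pairs):

    import heapq

    def dijkstra(graph, start, goal):
        distances = {start: 0}              # shortest known distance so far
        queue = [(0, start)]
        while queue:
            dist, node = heapq.heappop(queue)   # closest unvisited node
            if node == goal:
                return dist
            if dist > distances.get(node, float("inf")):
                continue                    # stale entry, shorter path known
            for neighbour, weight in graph[node]:
                new_dist = dist + weight
                if new_dist < distances.get(neighbour, float("inf")):
                    distances[neighbour] = new_dist
                    heapq.heappush(queue, (new_dist, neighbour))
        return None                         # goal unreachable

    graph = {"A": [("B", 1), ("C", 4)], "B": [("C", 2)], "C": []}
    print(dijkstra(graph, "A", "C"))   # 3, going via B rather than direct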

265) A* ALGORITHM: A* Search is an improvement on Dijkstra's algorithm. By using a heuristic to
estimate the distance to the final node, it is possible to find the shortest path in less time. It works
like Dijkstra's but instead of just using the distance from the start node to each node, it uses the
distance from the start node plus the heuristic estimate to the end node. It chooses which node to
expand next using the smallest distance + heuristic, then takes all nodes adjoining that node; other
nodes are compared again in future checks. The chosen node is only assumed to lie on the shorter
path, so if its adjoining nodes turn out not to be on the shortest path the algorithm may need to
backtrack to previous nodes.

266) Binary Search: Requires the list to be sorted in order to allow the appropriate items to be
discarded. It involves checking the item in the middle of the bounds of the space being searched. If
the middle item is bigger than the item we are looking for, it becomes the upper bound. If it is
smaller than the item we are looking for, it becomes the lower bound. Repeatedly discards and
halves the list at each step until the item is found. Is usually faster on a large set of data than linear
search because fewer items are checked, so is more efficient for large files. Doesn't benefit from an
increase in speed with additional processors. Can perform better on large data sets with one
processor than linear search with many processors.
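
A Python sketch of the halving of bounds described above (the list must already be sorted):

    def binary_search(items, target):
        low, high = 0, len(items) - 1
        while low <= high:
            mid = (low + high) // 2
            if items[mid] == target:
                return mid
            if items[mid] > target:
                high = mid - 1      # discard the upper half
            else:
                low = mid + 1       # discard the lower half
        return -1                   # not found

    print(binary_search([2, 5, 8, 12, 16], 12))   # index 3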

267) Linear Search: Start at the first location and check each subsequent location until the desired
item is found or the end of the list is reached. Does not need an ordered list and searches through
all items from the beginning one by one. Generally performs much better than binary search if the
list is small or if the item being searched for is very close to the start of the list. Can have multiple
processors searching different areas at the same time; linear search scales very well with additional
processors.
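
For comparison, a linear search in Python - no ordering needed, every location is checked in turn:

    def linear_search(items, target):
        for index, item in enumerate(items):
            if item == target:
                return index
        return -1    # reached the end without finding the target

    print(linear_search([9, 2, 7, 5], 7))   # index 2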
