Training Guide Final
Both problems are divided into subtasks (for a total score of 200,
100 for each problem). What this means is that, depending on the
efficiency of the algorithm or data structure you use, your program
could earn a partial score without being efficient enough to get the
full 100.
Soon enough, you could make a nice little folder with all the
problems you massacre and brag about it later. :D
Searching and Algorithm Analysis
Example
Let our array be {3, 4, 1, 5, 2}. If I ask you the location of the
number ‘3’, you return 1 (we index arrays from 1 throughout this
guide). If I ask you the index of the number ‘6’, you return 0, i.e.
false, since it’s not in the array.
Seems simple enough right? Mhm, it is! Let’s take our first look at
two Algorithms — Linear Search and Binary Search — see what
makes them so different and when to use them.
Linear Search
The idea of Linear Search is simple, we start at the first index and
check the number at that index. If we find our number, we say
“Found it Aulene-san” and return the index, and if we don’t, we say
“It’s not here ;c” and return false (0).
Example
Our array A = {3, 4, 1, 5, 2} and I have asked you the location of the
number ‘1’ in A.
We start from index A[1] and check the number there. It’s 3, so we
go on checking. A[2] is 4, A[3] is… what we’re looking for. :D
Return that index i.e 3.
What if the number we’re looking for is ‘7’? Well then we check
A[1], A[2] .. A[5]. No dice. We return 0.
Hint: Think of how many times the for loop will run before
returning a value.
In simpler terms, the block of code inside the Linear Search for
loop runs N times, with each execution counting as one operation,
i.e. O(1). So, we say that the complexity of our code is O(1 × N), or O(N).
Fear not, you’ll get familiar with the idea as you solve more
problems.
Let A be a sorted array of N elements, and let V be the number
we’re looking for. We start with:
1. Left = 1
2. Right = N
3. Mid = (Left + Right) / 2
If A[Mid] > V, we can say that all elements from Mid to N (i.e
A[Mid], A[Mid+1] … A[N]) are greater than V, hence we don’t need
to look for V between Mid and N.
If A[Mid] < V, we can say that all elements from 1 to Mid (i.e
A[1], A[2] .. A[Mid]) are lower than V, hence, we don’t need to look
for V between 1 and Mid.
Suppose our sorted array has N = 7 elements, with A[4] = 40, A[5] = 50 and
A[6] = 60, and we’re looking for V = 50. Initially Left = 1 and Right = 7,
making Mid = (1+7)/2 = 4.
At A[Mid], we find 40. Since 40 < 50, we can safely say that A[1] upto A[Mid]
are also lower than 50. Why? Because our array is sorted. (Don’t worry, we’ll
discuss sorting soon)
We update Left to be equal to Mid+1 i.e equal to 5 and repeat the process,
making Mid = (5+7)/2 = 6.
At A[Mid], we find 60. Since 60 > 50, we can safely say that A[Mid] upto
A[Right] are also greater than 50.
This time, we update Right to be equal to Mid-1 i.e equal to 5 and repeat the
process, making Mid = (5+5)/2 = 5.
Check A[5], we see A[5] = 50, which is the number we were looking for. We
return the index 5.
What if we were looking for a number not present in the array, say 55? Well,
the process would go just about the same as above, except on the last step, we
would return 0 since A[5] = 50 ≠ 55.
Because I’m a super-nice person, I’ll give you the code for Binary Search, but
make sure you implement it on your own before reading further. *virtual
pinky promise?*
But Aulene-senpai, I already know Linear Search, why do I need to
know Binary Search?
Patience, youngling.
Well, it turns out Binary Search is a LOT faster than Linear Search:
O(log N) instead of O(N). This is because we’re eliminating almost
half of our possible indices in one operation, rather than looking
through each and every element like we did in Linear Search.
1. BSEARCH1
2. FORESTGA
3. AGGRCOW
An Introduction to Recursion
You’ve all played with Legos, right? Those fun-filled, delightful little
creatures, until you step on one and see your life flashing before your eyes?
Anyway, so Legos start small; you put one block on top of the other until you
build the Walt Disney castle :3.
Recursion is similar to Legos. You take a very small problem i.e a
‘block’ and use it to solve/find the answer to larger problems (i.e A
Disneyland made of Lego blocks. I’m sorry I had a sheltered childhood.)
A Simple Series
F[i] = F[i-1] + F[i-2]
F[0] = 0
F[1] = 1
Here’s a very beautiful and intricately crafted tree in order for you
to understand this -
1. TSORT
2. INCPRO4
3. BOOKSHELVES - ZCO (Sorting)
4. VARIATION - ZCO (Sorting)
5. WORMHOLES (Sorting + Binary Search) - ZCO
6. PYRAMID - (Sorting + Binary Search)
Prefix Sums
Alright, so Prefix Sums are super important. Well, Aulene-kun,
what are Prefix Sums?
Prefix Sums can be defined as follows: for the ith element in an
array, Prefix[i] is the sum of all elements up to and including the
ith element.
Prefix[1] = A[1]
Prefix[i] = Prefix[i-1] + A[i]
Example
Let’s say we have an array A = [1, 3, 5, 6, 10].
Going by our definitions, Prefix = [1, 4, 9, 15, 25].
Alright, so think back: how would you use prefix sums to compute
the sum of any range in constant time?
1. CSUMQ
2. JURYMARKS
Anyway, you need to know prefix sums for what’s about to follow.
It’s time for my favourite section, the most important one for the
INOI and relatively tough so be prepared for l o t s of problems.
If you felt that the guide was like a low-effort YouTube video up
until now, it’s because everything up to this point doesn’t matter
nearly as much as what follows.
Really, if you want to clear the INOI, these next 2 sections are
everything. By everything, I mean..
E V E R Y T H I N G.
Scene 6
Enter: Dynamic Programming
Here’s the problem with Recursion. Let’s look at the recursive calls
for the Fibonacci problem again, shown below.
We can see that, unless we hit F(0) or F(1), we don’t return any
value. For example, we have to compute F(3) over and over again.
Don’t you think it would be faster if we somehow ‘remembered’ the
value of F(3)?
Let’s take it one step further: what if, every single time we
computed a new value for F[], (be it F[3], F[4], F[whatever]), we
stored that value somewhere and returned it as needed?
Here’s how our fibonacci calls look with the optimisation:
So here’s what you do. You solve each of these problems in the
order that I give you, and you solve them all. Fair warning, you’re
going to get a lot of problems here on out. Chit-chat’s over.
The first thing I want you to do before crossing names off a list like
Oliver Queen (Arrow reference, anyone?) is to head over to the
following links and read about all sorts of different Dynamic
Programming problem types. They’re explained beautifully, and do
a better job of explaining than my little hands ever could. DON’T
read the code presented at ANY of those links.
Trust me, you do this and the problems that follow, you’ll make
your DP lives easier by miles.
Give a lot of time to all of the following problems. There’s a lot to
learn here.
Also, STOP.
If you haven’t done so already, I recommend you check out my
document on STL. Graphs will require heavy use of the STL, so now
is a good time to get comfortable with basic data structures and the STL.
Playing with Graphs by Amit M. Agrawal Austin
LegendFlame (I dare you to Google ‘Austin LegendFlame’)
Let’s answer the first question: what are graphs? Well, graphs are
basically a set of objects, some of which are linked to each other.
Example
Let’s say we have five cities, Paris (City 1), New Delhi (City 2),
New York (City 3), London (City 4) and Shanghai (City 5). I have
to go to New York (City 3), and I’m currently in New Delhi (City
2). Here’s the flight plan for today:-
• Node: The cities on our graph i.e New York, New Delhi, etc are
called nodes. They are the objects on our graph.
• Edge: See the connections between cities? They’re called edges.
They represent a link between two objects, in this case, a flight
going from one city to another. You’ll encounter two types of
edges, Directed and Undirected.
• Directed Edge: This implies that there is a one-way connection
between two nodes. For example, in our graph, there is an edge
from Paris to Shanghai. You cannot use the same edge to travel
from Shanghai to Paris.
• Undirected Edge: This kind of edge symbolises a two-way
connection, meaning that you can use an edge from one node to
another in any direction.
Here’s how our graph would look if all the edges were undirected.
The above graphs are unweighted graphs. This means that
traversing any edge has no cost associated with it. However,
weighted graphs have a cost associated with traversing each edge.
In our example, the edge between two cities could have a
weight equal to the distance between them. This would make our
graph look something like this -
Now, how do we represent graphs? Glad you asked. Turns out,
there are two traditional ways to represent graphs: the Adjacency
Matrix and the Adjacency List.
The Adjacency Matrix Representation

Weighted (∞ means there is no edge):

      1     2     3     4     5
1     ∞  6597  5837     ∞  9257
2  6597     ∞     ∞     ∞  4245
3  5837     ∞     ∞  5585     ∞
4     ∞     ∞  5585     ∞  9199
5  9257  4245     ∞  9199     ∞

Unweighted (1 means an edge exists, 0 means it doesn’t):

   1  2  3  4  5
1  0  1  1  0  1
2  1  0  0  0  1
3  1  0  0  1  0
4  0  0  1  0  1
5  1  1  0  1  0
The Adjacency List Representation

Node 1: (2, 6597), (3, 5837), (5, 9257)
Node 2: (1, 6597), (5, 4245)
Node 3: (1, 5837), (4, 5585)
Node 4: (3, 5585), (5, 9199)
Node 5: (1, 9257), (2, 4245), (4, 9199)
Why do we need to know the Adjacency List representation
AND the Adjacency Matrix representation?
Well, you see, if you ever had to check whether an edge exists between
two nodes, its complexity would be O(N) in an Adjacency List, while it
would be O(1) in an Adjacency Matrix.
Graph Traversal
Now that you know how to represent a graph, we can move on to
how you should explore a graph. What’s ‘Traversal’? In simple
terms, think of nodes as ‘cities’ and edges as ‘roads’ that are
connecting those cities. We use these roads to go from one city to
another. In formal terms, this is exactly what traversal is.
Example Case:
u = 1, v = 4. The answer for this case will be 3, with the shortest
path going from 1 -> 3 -> 5 -> 4.
Spoilers on how to solve this problem follow.
Let’s see.. in order to find the shortest path from A to B, we need to
explore A’s neighbours, correct?
At the end of this ‘process’, here’s how our distance[] array will
look:

Node:        1  2  3  4  5
distance[]:  3  3  2  0  1
Also, because I’m really nice, here’s the code for the Depth-First
Search on an Adjacency Matrix.
Complexity Analysis of Depth-First Search
Adjacency List: O(V + E)
Adjacency Matrix: O(V²)
Using DFS, we start the traversal from the root node and explore
the search as far as possible from it (aka ‘depth-wise’). We keep on
exploring a node’s neighbours until we hit a leaf node or a node
which isn’t connected to any non-visited nodes. Exploration of a
node is suspended as soon as another unexplored node is found.
Problems -
1. TASKFORCE
2. GREATESC
3. PPRIME (Hint: Sieve of Eratosthenes)
4. CRITINTS
5. FENCE (INOI’17)
6. PT07Y
7. PT07Z
8. PARTY
9. NEWREFORM
10. SPECIES
Shortest Paths
Simply put, in a graph, what is the shortest ‘distance’ between two
nodes U and V, where the distance between two nodes is defined as
the sum of edge weights in the path from U to V.
Sample Case:
U = 1, V = 4.
Answer is 11422 km. (Path is 1 -> 3 -> 4)
The complexity of the above algorithm would be O(N²). However, we
can reduce this by using a heap (a priority queue), which brings the
complexity down to O((N + M) log N), where M is the number of edges.
The following is how I implement Dijkstra’s.
Problems -
1. EZDIJKST (Dijkstra’s Sample Problem)
2. SEQUENCELAND
3. WEALTHDISPARITY
4. SHPATH
5. WORDHOP
6. CLIQUED
Epilogue
So… that’s about it from what you’ll learn from this. Hope you had
fun.
You’ve done more than 50 problems just from this guide, each one
harder than the last. Even if you haven’t, you still have time to do
them now.
There’s a lot you can learn now - start with Segment Trees. Or try
your hand at a couple IOI problems. Whatever floats your boat :P,
but do have fun. I also hope that you, with whatever little skill you
have, give back to the programming community.
If you have any critique for this guide or any questions, please
email me at [email protected], or contact me over Facebook.