
Building A Basic Sudoku

This document provides an overview of building a basic Sudoku solver in Excel using iterative formulas. It describes setting up the necessary boards, including an input board, solution board, and valid values board. Formulas are used to populate the valid values board with numbers 1-9 based on row and column. Named ranges are created to make the formulas more readable and reusable across different cells. The solution board is initially populated with values from the input board. Examples of puzzles and partial solutions are provided. The overall goal is to create a simple Sudoku solver that functions through iterative calculation of formulas alone.

Uploaded by

Marian Bogdan
Copyright
© All Rights Reserved

Building a Basic, Understandable Sudoku Solver Using Excel Iterative Calculation Part 1/2
by Diego Oppenheimer, on September 30, 2008

Today's author, Charlie Ellis, a Program Manager on the Excel team, shares a
spreadsheet he built in Excel for solving Sudoku puzzles. The spreadsheet can be
found in the attachments at the bottom of this post.
For those of you who don't already know, Sudoku is a type of logic puzzle (that I was
completely addicted to about three years ago) that requires you to place the numbers
1-9 into a grid obeying certain rules (lots more information on Sudoku is available on
the web).
A while back, a fellow PM on the Excel team, Dan Cory, wrote a spreadsheet for
solving Sudoku puzzles using Excel formulas and made it available on Office Online
(here). Dan's spreadsheet was great in that, unlike many of the Sudoku solving
spreadsheets out there, it didn't use any VBA or other scripting to do the work of
solving the puzzles, and relied instead on the iterative calculation feature of Excel. It's
quite cool and has been a popular download, but one thing about the spreadsheet that I
wanted to see if I could improve upon was just how complicated it is. In fact, Dan
made every single cell its own different formula, and he ended up having to use VBA
to create the formulas, because maintaining and debugging them without an automated
way to write all those different formulas was impossible.
As soon as I saw Dan's spreadsheet, I wanted to make my own version of a Sudoku
solver that not only used only formulas, but also one where the formulas were
relatively understandable and there were a small number of distinct formulas. It turned
out to not be that tough to build, but I think I learned a fair amount trying different
approaches to the problems of making an iterative model like this one perform well
and at the same time be reasonably maintainable and understandable. I think it might
even have turned up a reasonably useful way of looking at abstraction within formulas
given the Excel formula language. I've always wanted to blog about the process of
creating this spreadsheet and about how iterative formulas work, to show the power of
Excel's formula language, because it illustrates the usefulness of circular references
and iterative calculation, and because I just think it's an incredible amount of fun, so
here goes. Lots of people have created more powerful solvers, many as spreadsheets,
some using just formulas, but I wanted to try to explain how you can go about
creating a solver and hopefully share some formula tricks that people find useful.
Pre-reqs
Creating a spreadsheet for solving a Sudoku isn't entry-level spreadsheeting. In
addition to being pretty good with formulas, you'll need to understand the concept of
iteration. Chris Rae did a great job of explaining the topic in his earlier post on
Iteration & Conway's Game of Life, so I'm not going to repeat that, and I'll simply
assume you already understand iteration. Second, we're going to make extremely
heavy use of named ranges, and for the stuff I'm doing, the new name manager is very
helpful (see Formula building improvements Part 4: Defined Names for some
information about this) and I'm going to assume working knowledge of it and of
named ranges generally (though I'm going to show some tricks which may be new to
even experienced formula users). Finally, you'll need to at least be familiar with array
notation in Excel.
Setting up the boards
For those I haven't lost already, I'm going to start by creating a series of boards very
much like the ones that Dan Cory used: one 9×9 board for my input, one 9×9 board
for the solution, and a 27×27 board for the possible values in each box. I do this by
changing the row height, column width, font, and zoom such that all the cells are
small squares and then applying borders and fills to get the following:

The input and solution boards are reasonably straightforward (the input board is the
one in the top left where you'll type in a puzzle to be solved, the solution board is
where the correct answer hopefully shows up). The board with possible values, which
I'll call the valid values board, is a bit trickier. It is 27×27 because each box in the
input and solution boards is represented by a 3×3 set of cells in the valid values board.
Each of these nine cells represents whether one of the numbers 1-9 is still in the
running to be the actual value for the corresponding box of the solution board, and the
set of possible values for a given box in the input/solution boards is the set of all the
numbers in a single 3×3 "big cell" that are not blank. If it isn't already, the
purpose/use of this board should become clear later. For now, let's fill in all the
possible values from one to nine in each of these big cells.
Filling in the valid value board

We want to do this by creating a single formula that will fill in the various numbers
1-9 based on which row and column the formula sits in, and then we'll later add logic to
blank out the numbers that aren't valid. This formula is a little more complicated than
the average spreadsheet formula, so I'll first give the whole formula and then break it
down. This looks like the following:

= MOD(COLUMN(A1)-1,3) + 1 + MOD(ROW(A1)-1,3)*3
When this is entered into the top-left cell of the valid values board and then filled into
the entire valid values board, it gives the following results:

Note that you'll want to do the filling in with either Paste Special | Formulas or
CTRL+Enter because otherwise you'll mess up all the pretty formatting.
Breaking this formula down, ROW and COLUMN return (duh) the row or column of
the reference passed to them as a number. Passing these functions A1, as in this
formula, means they'll give us a number that starts at one and goes up. The first part
of the formula uses the MOD function to transform the column numbers given by
COLUMN into the numbers 0-2, and then adds one to get 1-3. To this we add a 0, 3,
or 6, depending on the row number, by using the MOD function on the result of the
ROW function and multiplying by three.
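
As a cross-check outside Excel, the same arithmetic can be sketched in Python. This is only an illustration: the function name onetonine is borrowed from the name introduced in this post, the coordinates are 1-based sheet rows and columns, and the board is assumed to start at P4 (row 4, column 16), as the later sections state.

```python
def onetonine(row, col):
    """Mirror of MOD(COLUMN()-1,3) + 1 + MOD(ROW()-1,3)*3.

    row and col are 1-based sheet coordinates of the cell holding the formula.
    """
    return (col - 1) % 3 + 1 + ((row - 1) % 3) * 3


# Assumption: the valid values board starts at P4, i.e. row 4, column 16.
top_row, left_col = 4, 16

# The first 3x3 "big cell" of the board.
big_cell = [
    [onetonine(top_row + r, left_col + c) for c in range(3)]
    for r in range(3)
]

for line in big_cell:
    print(line)
# prints:
# [1, 2, 3]
# [4, 5, 6]
# [7, 8, 9]
```

Because both MOD terms repeat every three cells, the same 1-9 pattern appears in every big cell as long as the board starts three rows and three columns (or any multiple of three) away from A1.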
Next, because that's a bit of a gnarly formula to have sitting around, and we're going
to have to use it all over the place, we're going to take this formula and move it out of
the cell and into a named range. This allows us to abstract away all of the logic for this
formula into a single, understandable name. For lack of a better name, I'm going to
call it "onetonine" and it will have the same exact formula we just created. Because
the context for the relative references (i.e. what they take as being the current cell) is
determined by what cell you're in when you create the named range, it's critical that
you start off by selecting cell A1, then create the new named range, so that your
formula works everywhere within the sheet.

This is also why we allow gutters of three rows and three columns around all the
boards.
Now we can take our new name and test it out in the board, like so:

Here CTRL+Enter is by far the easiest way to set the formula for all the cells in the
valid values board. First select the whole board, then type in the formula, and instead
of pressing Enter, just hit CTRL+Enter to fill the formula you just typed into all the
cells (without messing up their formatting).
Setting up the solution board
We're going to want to base what valid values are left for a given box on what our
current solution looks like (as opposed to the input), but in order to do that, we need
something in the solution board. To begin with, at least, the solution will definitely
contain all of the numbers in boxes from the input board. Let's start off by doing this
in the simplest possible way, while catching the case of blanks. In the solution board,
let's make the cells there all simply equal the corresponding cell in the input board
using relative references unless the input cell is blank. The absolute easiest way to do
this is with the following formula (shown in the form in which it would be entered
into cell D16):

=IF(D4,D4,"")
Again, use CTRL+Enter to fill this into the appropriate cells. Now that we have the
base thing working, let's make it more re-usable and meaningful by using named
ranges.
As we did with the name onetonine, let's abstract the concept of referring to the
correct input cell from any cell in the solution board and turn that into a name. We'll
need to do something similar for all the boards at some point, so we'll start by making
named ranges for each of the boards (I chose in_board, sol_board, and val_board) and
then a name to go from the solution board to the input board (in_cell_from_sol), which
is simply =Main!D4. We then use this name to change the formula to
=IF(in_cell_from_sol, in_cell_from_sol, ""). Note that the name needs to be created
from D16, so that the relative reference points at the matching input cell.
OK, so far we just made our formula longer, but trust me, this concept becomes a life
saver. Doing the same for valid value cells from solution board cells is only a bit
trickier. The name sol_cell_from_val is:

=INDEX(sol_board, INT((ROW(Main!A1)-1)/3)+1,
INT((COLUMN(Main!A1)-1)/3)+1)
This must be created from cell P4. This formula uses ROW and COLUMN together
with the division operator and INT to convert from the coordinates of the current cell
in the 27×27 board to their coordinates in a 9×9 board, then uses INDEX to get the
cell out of the sol_board corresponding to those coordinates.
A neat way of testing this formula is to click into the "Refers to" box of the name
manager from different cells in the valid values board. Depending on what cell you're
in, you'll see dancing ants (a moving highlight) for a different cell, hopefully the
corresponding cell in the solution board.
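
The coordinate conversion inside sol_cell_from_val can also be checked by hand or in a few lines of Python. The sketch below is an illustration only: sol_index_from_val is a made-up helper name, and its arguments are 1-based positions within the 27×27 board, which is what ROW(Main!A1) and COLUMN(Main!A1) evaluate to when the name is created from P4.

```python
def sol_index_from_val(val_row, val_col):
    """Map a 1-based position in the 27x27 board to a 1-based position
    in the 9x9 board, mirroring INT((n-1)/3)+1 for rows and columns."""
    sol_row = (val_row - 1) // 3 + 1
    sol_col = (val_col - 1) // 3 + 1
    return sol_row, sol_col


# Every cell of the first 3x3 big cell maps to solution box (1, 1).
print(sol_index_from_val(1, 1))    # prints (1, 1)
print(sol_index_from_val(3, 3))    # prints (1, 1)

# Crossing into the next big cell moves one box down or across.
print(sol_index_from_val(4, 1))    # prints (2, 1)
print(sol_index_from_val(13, 26))  # prints (5, 9)
print(sol_index_from_val(27, 27))  # prints (9, 9)
```

Integer division plays the role of INT here: three consecutive rows (or columns) of the big board collapse onto one row (or column) of the small one.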

Now that we have some basics, let's put in an actual puzzle and see about getting the
inputs to propagate to the solution board and the valid value board. Here's the puzzle
we'll use:

After entering it, the solution board should look like the input board. To make the
valid value board work, we use this formula for all valid board cells:

=IF(sol_cell_from_val<>"",IF(sol_cell_from_val=onetonine,
onetonine,""), onetonine)
This means that the current cell is blanked out if a value exists in the solution cell and
that value isn't the current onetonine value.
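
The branching in this formula can be sketched in Python as follows. This is an illustration only: valid_value_cell is a hypothetical helper name, and an empty string stands in for a blank cell, as it does in the formula.

```python
BLANK = ""


def valid_value_cell(solution_value, candidate):
    """Mirror of the valid value formula before any Sudoku rules are added.

    solution_value: value of the matching solution box (BLANK if unsolved).
    candidate: the 1-9 number this cell stands for (its onetonine value).
    """
    if solution_value != BLANK:
        # The box is solved: keep only the candidate equal to the solution.
        return candidate if solution_value == candidate else BLANK
    # The box is unsolved: every candidate is still in the running.
    return candidate


# A solved box holding 7 keeps only the 7 in its big cell.
print([valid_value_cell(7, n) for n in range(1, 10)])
# prints ['', '', '', '', '', '', 7, '', '']

# An unsolved box keeps all nine candidates.
print([valid_value_cell(BLANK, n) for n in range(1, 10)])
# prints [1, 2, 3, 4, 5, 6, 7, 8, 9]
```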
In the spreadsheet, this should give you:

Now we're ready to do the stuff that will actually help come up with solutions based
on the rules and strategies of Sudoku.
Checking for a number in the rows of the solution board
The main rule of Sudoku is that you can't have two of the same number in any row,
column, or 3×3 "big box". We will start by adding the rule that there can't be more than
one of a number in any row, and then work in columns and big boxes. For example,
in the second big cell in the first row, none of the numbers 4, 2, 7 or 9 are possible as a
result of this rule. We can do this by turning blank any cells in the valid values board
for which 1) a solution for the box doesn't exist and 2) the row of the solution
board contains the number equal to the current value of onetonine. Note that condition
#1 is precisely where the last onetonine shows up in the formula for valid value cells
(i.e. no solution exists for the current big cell), so all we have to do is put the logic
for #2 there. This logic can be expressed as:

=IF(COUNTIF(sol_row_from_val, onetonine)>0, "", onetonine)


Where sol_row_from_val is:

=INDEX(sol_board, INT((ROW(Main!A1)-1)/3)+1, 0)
Again, this must be entered from P4.
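
The row test is easy to try outside Excel. The Python sketch below is only an illustration: the row contents are hypothetical (the post says only that 4, 2, 7 and 9 already appear in the first row, not where), and remaining_in_row is a made-up helper name.

```python
BLANK = ""


def remaining_in_row(sol_row):
    """Return the numbers 1-9 that do not yet appear in a solution row.

    sol_row.count(n) > 0 plays the role of COUNTIF(sol_row_from_val, n) > 0.
    """
    return [n for n in range(1, 10) if sol_row.count(n) == 0]


# Hypothetical first row of the solution board: 4, 2, 7 and 9 are placed,
# and the three boxes of the second big cell (positions 4-6) are blank.
first_row = [4, BLANK, 2, BLANK, BLANK, BLANK, 7, BLANK, 9]

print(remaining_in_row(first_row))
# prints [1, 3, 5, 6, 8]
```

Each blank box in that row keeps only these candidates once the row rule is applied; the column and big box rules then narrow the list further.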
So combining these we get:

=IF(sol_cell_from_val<>"",IF(sol_cell_from_val=onetonine,
onetonine,""), IF(COUNTIF(sol_row_from_val, onetonine)>0,
"",onetonine))
Which, while not simple, is at least understandable and gives you a valid values board
that looks like this:

Extending to columns and (3×3) big boxes


When we go to add the rules for "no two of the same number in a column" and "no
two of the same number in a big box" in this same way, we will run into two
problems: 1. you can't create sol_bigbox_from_val directly using INDEX, because
INDEX only returns a cell, row, or column from a range or the whole range, and 2. it
will start to get unwieldy to have all three of the COUNTIFs ORed together at the end
of this formula.
To solve the first problem, you can use OFFSET (as you could use OFFSET to create
any of the other references here), but because OFFSET is volatile this will lead to
performance problems down the road. A better solution is to join two references that
you get from INDEX with the colon (strictly speaking Excel's range operator, which
spans everything between two corner references) in order to make a 3×3 range. This
gives us a sol_bigbox_from_val with the following formula (entered from P4):

=INDEX(sol_board, INT((ROW(Main!A1)-1)/9)*3+1,
INT((COLUMN(Main!A1)-1)/9)*3+1):INDEX(sol_board,
INT((ROW(Main!A1)-1)/9)*3+3, INT((COLUMN(Main!A1)-1)/9)*3+3)
By now we can pick this formula apart more easily. The INT, ROW, division part says
that for every nine rows you move in the valid value board, move down by a block of
three rows in the solution board. There's a similar expression for columns that
accomplishes much the same thing moving across. The second reference is precisely
the first reference, but offset two rows down and two columns across, giving you a
3×3 box.
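
The two corners produced by the INDEX calls can be sketched in Python to check the arithmetic. This is an illustration only: bigbox_bounds is a made-up helper name, and its arguments are 1-based positions within the 27×27 valid values board.

```python
def bigbox_bounds(val_row, val_col):
    """Return the 1-based (top, left) and (bottom, right) corners, within
    the 9x9 solution board, of the big box that covers a given cell of the
    27x27 valid values board. Mirrors INT((n-1)/9)*3+1 and INT((n-1)/9)*3+3.
    """
    top = (val_row - 1) // 9 * 3 + 1
    left = (val_col - 1) // 9 * 3 + 1
    # The second corner is two rows down and two columns across.
    return (top, left), (top + 2, left + 2)


# Any cell in the first nine rows and columns belongs to the top-left big box.
print(bigbox_bounds(1, 1))    # prints ((1, 1), (3, 3))
print(bigbox_bounds(9, 9))    # prints ((1, 1), (3, 3))

# Moving nine rows down shifts the box by three rows in the solution board.
print(bigbox_bounds(10, 1))   # prints ((4, 1), (6, 3))
print(bigbox_bounds(14, 20))  # prints ((4, 7), (6, 9))
print(bigbox_bounds(27, 27))  # prints ((7, 7), (9, 9))
```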
Now that we have this, we could write one big formula that covers whether the current
onetonine value already exists in any of the row, column or big box, but let's use
abstraction again here to keep the fundamental formula of the valid value board
simpler. Instead of putting it directly into the formula, let's invent a new name called
solution_in_rcb for "does there exist a solution cell with my number in any of the row,
column, or bigbox?" This name only ever has to return true or false (doing the test
part of what condition #2 does above) and, despite not being short, is actually really
simple to write:

=OR(COUNTIF(sol_row_from_val, onetonine)>0,
COUNTIF(sol_col_from_val, onetonine)>0,
COUNTIF(sol_bigbox_from_val, onetonine)>0)
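
To see the three tests working together, here is a rough Python equivalent. It is a sketch under stated assumptions: the board contents are made up for illustration (they are not the puzzle used in this post), rows and columns are 0-based Python indices rather than sheet coordinates, and candidates is a hypothetical helper that collects what one 3×3 big cell of the valid values board would show.

```python
BLANK = ""


def solution_in_rcb(sol, row, col, n):
    """True if n already appears in the row, column, or 3x3 big box
    of box (row, col) in the 9x9 solution board sol (0-based indices)."""
    in_row = n in sol[row]
    in_col = any(sol[r][col] == n for r in range(9))
    top, left = row // 3 * 3, col // 3 * 3
    in_box = any(
        sol[r][c] == n
        for r in range(top, top + 3)
        for c in range(left, left + 3)
    )
    return in_row or in_col or in_box


def candidates(sol, row, col):
    """Numbers still possible for box (row, col)."""
    if sol[row][col] != BLANK:
        return [sol[row][col]]
    return [n for n in range(1, 10) if not solution_in_rcb(sol, row, col, n)]


# Made-up partial board, mostly blank.
sol = [[BLANK] * 9 for _ in range(9)]
sol[0][0] = 5  # same row and same big box as the box tested below
sol[0][4] = 3  # same row
sol[1][2] = 9  # same big box
sol[4][1] = 7  # same column

print(candidates(sol, 0, 1))  # prints [1, 2, 4, 6, 8]
print(candidates(sol, 0, 0))  # prints [5]
print(candidates(sol, 8, 8))  # prints [1, 2, 3, 4, 5, 6, 7, 8, 9]
```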
Taking advantage of this new name makes our new formula for valid value cells:

=IF(sol_cell_from_val<>"",IF(sol_cell_from_val=onetonine,
onetonine,""), IF(solution_in_rcb, "",onetonine))
Which is not only shorter than this formula had been and much more understandable,
it also results in some clear places where there's only one possible solution:

So we can eyeball some solutions, but the trick now is to feed those into the solution
board. This is where iteration comes in. Next time we'll use iteration and a few more
formula tricks to solve some Sudokus.
Edit: Updated the sol_bigbox_from_val formula to reflect what it looks like when
entered from the starting cell in P4. Also clarified in a couple other places that the
starting cell should be P4.
