Mid II: Sudoku Solver

Instructions

Sudoku

The following description is taken from Wikipedia. Sudoku, meaning single number, is a logic-based, combinatorial number-placement puzzle. The objective is to fill a 9×9 grid with digits so that each column, each row, and each of the nine 3×3 sub-grids that compose the grid (also called “blocks”) contains all of the digits from 1 to 9. The puzzle setter provides a partially completed grid, which for a well-posed puzzle has a unique solution. A Sudoku puzzle and its solution are shown below.

Sudoku Sudoku Solved

The following problem is taken from an online Clojure course.

Representation

Let our board be represented by a vector of vectors. Here 0 is used to encode an empty square in a sudoku board.

(def sudoku-board
    [[5 3 0 0 7 0 0 0 0]
    [6 0 0 1 9 5 0 0 0]
    [0 9 8 0 0 0 0 6 0]
    [8 0 0 0 6 0 0 0 3]
    [4 0 0 8 0 3 0 0 1]
    [7 0 0 0 2 0 0 0 6]
    [0 6 0 0 0 0 2 8 0]
    [0 0 0 4 1 9 0 0 5]
    [0 0 0 0 8 0 0 7 9]])

Here is the solution of the above sudoku board:

(def solved-board
    [[5 3 4 6 7 8 9 1 2]
    [6 7 2 1 9 5 3 4 8]
    [1 9 8 3 4 2 5 6 7]
    [8 5 9 7 6 1 4 2 3]
    [4 2 6 8 5 3 7 9 1]
    [7 1 3 9 2 4 8 5 6]
    [9 6 1 5 3 7 2 8 4]
    [2 8 7 4 1 9 6 3 5]
    [3 4 5 2 8 6 1 7 9]])

You’re going to need the set of numbers used in sudoku often, so let’s define them as a set:

(def all-values #{1 2 3 4 5 6 7 8 9})

In sudoku, one needs to fill a board so that every line, every column and every block has the numbers from 1 to 9 exactly once. Blocks are 3x3 areas inside the sudoku board.

;[[5 3 0 | 0 7 0 | 0 0 0]
; [6 0 0 | 1 9 5 | 0 0 0]
; [0 9 8 | 0 0 0 | 0 6 0]
; -------+-------+-------
; [8 0 0 | 0 6 0 | 0 0 3]
; [4 0 0 | 8 0 3 | 0 0 1]
; [7 0 0 | 0 2 0 | 0 0 6]
; -------+-------+-------
; [0 6 0 | 0 0 0 | 2 8 0]
; [0 0 0 | 4 1 9 | 0 0 5]
; [0 0 0 | 0 8 0 | 0 7 9]]

Working with nested structures

get can be used to get elements from vectors. To get from a nested vector, we need to call get twice:

(get (get [["a" "b"] ["c" "d"]] 0) 1)
;=> (get ["a" "b"] 1)
;=> "b"

Getting values from nested structures with get gets ugly. To remedy this, there is (get-in nested-structure directives). This is much like accessing a 2-D array.

(get-in [["a" "b"] ["c" "d"]] [0 1])
;=> (get (get [["a" "b"] ["c" "d"]] 0) 1)
;=> (get ["a" "b"]                     1)
;=> "b"

It works with everything that supports get, including maps and vectors.

(def cities {:title "The City and the City"
             :author {:name "China Miéville"
                      :birth-year 1972}})

(get-in cities [:author :name])
;=> (get (get cities :author)                       :name)
;=> (get {:name "China Miéville", :birth-year 1972} :name)
;=> "China Miéville"

To figure out which numbers are valid for a square we need to know which are already taken. Let’s write a few functions to figure this out.

Exercise 1

Write the function (row-values board coordinates) that returns a set with all numbers on the row of the coordinates

Use destructing inside the parameter vector to get the row.

(row-values sudoku-board [0 2]) ;=> #{0 5 3 7}
(row-values sudoku-board [3 2]) ;=> #{0 8 6 3}

Write the function (col-values board coordinates) that returns a set with all numbers on the col of the coordinates

(col-values sudoku-board [0 2]) ;=> #{0 8}
(col-values sudoku-board [4 8]) ;=> #{3 1 6 0 5 9}

Finally, we need to figure out what numbers are inside the block of a square.

To make working with blocks a little easier, let’s take a small detour.

You can use for to go through a sequence like with map:

(for [number [1 2 3]]
  (+ number 2))
;=> (3 4 5)

Here the name number gets bound to each value of the sequence [1 2 3] one by one. For each value, evaluate the body (+ number 2) with it and collect the results into a sequence.

But you can give for multiple bindings, and it will go through all combinations:

(for [name ["John" "Jane"]
      number [1 2 3]]
  (str name " " number))
;=> ("John 1" "John 2" "John 3" "Jane 1" "Jane 2" "Jane 3")

If you happen to be familiar with list comprehensions from some other language, for is Clojure’s list comprehension form.

To make working with coordinates a bit easier, let’s write a function that returns a sequence of coordinate pairs.

Exercise 2

Write the function (block-values board coordinates) that returns a set with all numbers in the block of coordinates.

Write a helper function that returns the coordinates for the top left corner of the block and another helper function that returns all nine coordinate pairs in the block.

(block-values sudoku-board [0 2]) ;=> #{0 5 3 6 8 9}
(block-values sudoku-board [4 5]) ;=> #{0 6 8 3 2}

The clojure.set namespace has some useful functions for working with sets. (clojure.set/union set1 set2 ...) returns a set containing all the elements of its arguments:

(clojure.set/union #{1 2} #{2 3} #{7}) ;=> #{1 2 3 7}

Another helpful set operation is (clojure.set/difference set1 set2), which returns a set with all elements of set1 except those that are also in set2. Or put another way, removes all elements of set2 from set1:

(clojure.set/difference #{1 2 3} #{1 3})   ;=> #{2}
(clojure.set/difference #{1 2 3} #{2 4 5}) ;=> #{1 3}

Exercise 3

Write the function (valid-values-for board coordinates) that returns a set with all valid numbers for the square at coordinates.

If the square at coordinates already has a value, valid-values should return the empty set #{}.

Remember that we already defined the set all-values.

(valid-values-for sudoku-board [0 0]) ;=> #{}
(valid-values-for sudoku-board [0 2]) ;=> #{1 2 4})

Next, let’s write a function to figure out if a sudoku board is completely filled.

Exercise 4

Write the function (filled? board) which returns true if there are no empty squares in board, and otherwise false.

Remember that (contains? set element) can be used to check if element is in set.

(filled? sudoku-board) ;=> false
(filled? solved-board) ;=> true

Write the function (find-empty-point board) that returns coordinates to an empty point (that is, in our representation has value 0).

Now that we can check if a board is full, it would be nice to know if the solution is valid.

A sudoku is valid if each row, each column and each block contains the numbers from 1 to 9 exactly once. Let’s write functions for checking each of these conditions.

To start, let’s write some functions to get the values for each row, column and block.

Exercise 5

Write the function (rows board) that returns a sequence of value sets for each row of board. That is, the first set in (rows board) is a set that has every element of the first row of board as element and so on.

(rows sudoku-board) ;=> [#{5 3 0 7}
                    ;    #{6 0 1 9 5}
                    ;    #{0 9 8 6}
                    ;    #{8 0 6 3}
                    ;    #{4 0 8 3 1}
                    ;    #{7 0 2 6}
                    ;    #{0 6 2 8}
                    ;    #{0 4 1 9 5}
                    ;    #{0 8 7 9}]
(rows solved-board) ;=> [#{1 2 3 4 5 6 7 8 9}
                    ;    #{1 2 3 4 5 6 7 8 9}
                    ;    #{1 2 3 4 5 6 7 8 9}
                    ;    #{1 2 3 4 5 6 7 8 9}
                    ;    #{1 2 3 4 5 6 7 8 9}
                    ;    #{1 2 3 4 5 6 7 8 9}
                    ;    #{1 2 3 4 5 6 7 8 9}
                    ;    #{1 2 3 4 5 6 7 8 9}
                    ;    #{1 2 3 4 5 6 7 8 9}]

Write the function (cols board) that returns the values of each column in board as a sequence of sets.

(cols sudoku-board) ;=> [#{5 6 0 8 4 7}
                    ;    #{3 0 9 6}
                    ;    #{0 8}
                    ;    #{0 1 8 4}
                    ;    #{7 9 0 6 2 1 8}
                    ;    #{0 5 3 9}
                    ;    #{0 2}
                    ;    #{0 6 8 7}
                    ;    #{0 3 1 6 5 9}]
(cols solved-board) ;=> [#{1 2 3 4 5 6 7 8 9}
                    ;    #{1 2 3 4 5 6 7 8 9}
                    ;    #{1 2 3 4 5 6 7 8 9}
                    ;    #{1 2 3 4 5 6 7 8 9}
                    ;    #{1 2 3 4 5 6 7 8 9}
                    ;    #{1 2 3 4 5 6 7 8 9}
                    ;    #{1 2 3 4 5 6 7 8 9}
                    ;    #{1 2 3 4 5 6 7 8 9}
                    ;    #{1 2 3 4 5 6 7 8 9}]

Write the function (blocks board) that returns the values of each block in board as a sequence of sets.

(blocks sudoku-board) ;=> [#{5 3 0 6 9 8}
                      ;    #{0 7 1 9 5}
                      ;    #{0 6}
                      ;    #{8 0 4 7}
                      ;    #{0 6 8 3 2}
                      ;    #{0 3 1 6}
                      ;    #{0 6}
                      ;    #{0 4 1 9 8}
                      ;    #{2 8 0 5 7 9}]
(blocks solved-board) ;=> [#{1 2 3 4 5 6 7 8 9}
                      ;    #{1 2 3 4 5 6 7 8 9}
                      ;    #{1 2 3 4 5 6 7 8 9}
                      ;    #{1 2 3 4 5 6 7 8 9}
                      ;    #{1 2 3 4 5 6 7 8 9}
                      ;    #{1 2 3 4 5 6 7 8 9}
                      ;    #{1 2 3 4 5 6 7 8 9}
                      ;    #{1 2 3 4 5 6 7 8 9}
                      ;    #{1 2 3 4 5 6 7 8 9}])

Now we can get the values used in every row, column and block. Let’s write functions that check if every row, column and block is valid as per the rules of sudoku.

Exercise 6

Write the function (valid-rows? board) that returns true if every row in board is a valid filled row.

(valid-rows? solved-board)  ;=> truthy
(valid-rows? invalid-board) ;=> falsey

Write the function (valid-cols? board) that returns true if every row in board is a valid filled column.

(valid-cols? solved-board)  ;=> truthy
(valid-cols? invalid-board) ;=> falsey

Write the function (valid-blocks? board) that returns true if every block in board is a valid filled block.

(valid-blocks? solved-board)  ;=> truthy
(valid-blocks? invalid-board) ;=> falsey

Finally, write the function (valid-solution? board) that checks if the whole board is a valid solution.

(valid-solution? solved-board)  ;=> truthy
(valid-solution? invalid-board) ;=> falsey)

Now we can verify whether or not a solution is valid. However, if we want to actually solve a sudoku, we need to be able to modify a partial solution.

Earlier we saw how useful get-in can be when indexing nested structures. Theres a similar function for changing nested structures, called assoc-in. (assoc-in nested-structure path new-value) changes the value pointed by path, which is a sequence of keys. Here’s an example:

 (assoc-in [[:a :b] [:c :d]] [1                                  0] :E)
;=> (assoc [[:a :b] [:c :d]]  1 (assoc (get [[:a :b] [:c :d]] 1) 0  :E))
;=> (assoc [[:a :b] [:c :d]]  1 (assoc               [:c :d]     0  :E))
;=> (assoc [[:a :b] [:c :d]]  1 [:E :d])
;=>        [[:a :b] [:E :d]]

Okay, so now we can find an empty location and we also know what the valid values for that location are. What’s left is to try each one of those values in that location and trying to solve the rest. This is called backtracking search. You try one choice and recurse, if the recursive call didn’t find any solutions, try the next choice. If none of the choices return a valid solution, return nil.

Let’s take a small detour and see an example of backtracking search.

Subset Sum

Subset sum is a classic problem. Here’s how it goes. You are given:

and you want to know if there is some subset of the original set that sums up to the target. We’re going to solve this by brute force using a backtracking search.

Here’s one way to implement it:

(defn sum [a-seq]
  (reduce + a-seq))

(defn subset-sum-helper [a-set current-set target]
  (if (= (sum current-set) target)
    [current-set]
    (let [remaining (clojure.set/difference a-set current-set)]
      (for [elem remaining
            solution (subset-sum-helper a-set
                                        (conj current-set elem)
                                        target)]
        solution))))

(defn subset-sum [a-set target]
  (subset-sum-helper a-set #{} target))

So the main thing happens inside subset-sum-helper. First of all, always check if we have found a valid solution. Here it’s checked with

  (if (= (sum current-set) target)
    [current-set]

If we have found a valid solution, return it in a vector (We’ll see soon why in a vector). Okay, so if we’re not done yet, what are our options? Well, we need to try adding some element of a-set into current-set and try again. What are the possible elements for this? They are those that are not yet in current-set. Those are bound to the name remaining here:

    (let [remaining (clojure.set/difference a-set current-set)]

What’s left is to actually try calling subset-sum-helper with each new set obtainable in this way:

      (for [elem remaining
            solution (subset-sum-helper a-set
                                        (conj current-set elem)
                                        target)]
        solution))))

Here first elem gets bound to the elements of remaining one at a time. For each elem, solution gets bound to each element of the recursive call

            solution (subset-sum-helper a-set
                                        (conj current-set elem)
                                        target)]

And this is the reason we returned a vector in the base case, so that we can use for in this way. Finally, we return each such solution in a sequence.

So let’s try this out:

    (subset-sum #{1 3 4 10 9 23} 20)
;=> (#{1 9 10} #{1 9 10} #{1 9 10} #{1 9 10} #{1 9 10} #{1 9 10})

Okay, so the above example is not exactly optimal. It forms each set many times. Since we were only interested in one solution, however, we can just add first to the call if the helper:

(defn subset-sum [a-set target]
  (first (subset-sum-helper a-set #{} target)))

And due to the way Clojure uses laziness, this actually cuts the computation after a solution is found (well, to be exact, after 32 solutions have been found due to the way Clojure chunks lazy sequences).

Solving Sudokus

It’s finally time to write the search for a solution to sudokus.

Exercise 7

Write the function (solve board) which takes a sudoku board as a parameter and returns a valid solution to the given sudoku.

   (solve sudoku-board) ;=> solved-board
   

Recap of backtracking:

Design and code quality

Proper indentation, naming, and coding quality are important. Performance is not a focus of this assignment. If you reach the last Exercise, I recommend that you give it 20-30 minutes with a pen and paper before actually starting to code.