Of course, this solver is for the standard Sudoku game on a 9 x 9 grid. Perhaps the claim in the article refers to "Sudoku" in the sense of the generalized problem of solving n x n Sudoku, for any n?
As a general rule, when anybody refers to a problem being NP-hard or whatever, they're talking about a generalized version of the problem, with at least one "size" variable--the time complexity has to be polynomial/exponential/whatever in something.
9x9 Sudoku has a constant size, which means that it can be evaluated in O(1) time, if such notation makes sense.
Perhaps useful to add that Sudoku reduces to the more general exact-cover problem[0], which is indeed NP-hard.
Also, if you've never run across Dancing Links[1], do yourself a favor and go take a look. It's a method (by Knuth) for implementing an algorithm (ditto) for generally solving any exact-cover problem (including Sudoku of course). Basically it amounts to constructing a matrix that encompasses all of the game's constraints and possible inputs, after which all solutions can be found through mechanical operations on the matrix. It's both beautiful and very fun to implement.
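For anyone curious about the shape of it: Sudoku maps to exact cover with 324 columns (81 cell constraints, plus 9x9 row-digit, column-digit, and box-digit constraints) and 729 candidate rows (digit d in cell (r,c)). Here's a minimal sketch of Knuth's Algorithm X over plain Python dicts and sets rather than the actual dancing-links pointer structure, so you lose the O(1) undo, but the cover/uncover logic is the same. The names X, Y, select, and deselect are just mine for illustration.

  # Algorithm X for exact cover, sketched with dicts/sets instead of
  # Knuth's doubly linked "dancing links" lists.
  # X maps each constraint (column) to the set of rows that satisfy it;
  # Y maps each row to the list of constraints it satisfies.

  def solve(X, Y, solution=None):
      if solution is None:
          solution = []
      if not X:
          yield list(solution)               # every constraint covered exactly once
          return
      # pick the constraint with the fewest candidate rows (Knuth's heuristic)
      c = min(X, key=lambda col: len(X[col]))
      for r in list(X[c]):
          solution.append(r)
          cols = select(X, Y, r)             # cover: remove satisfied constraints
          yield from solve(X, Y, solution)
          deselect(X, Y, r, cols)            # uncover: restore them on backtrack
          solution.pop()

  def select(X, Y, r):
      cols = []
      for j in Y[r]:
          for i in X[j]:
              for k in Y[i]:
                  if k != j:
                      X[k].remove(i)
          cols.append(X.pop(j))
      return cols

  def deselect(X, Y, r, cols):
      for j in reversed(Y[r]):
          X[j] = cols.pop()
          for i in X[j]:
              for k in Y[i]:
                  if k != j:
                      X[k].add(i)

  # toy instance: universe {1..7}, rows A..F; the unique cover is B, D, F
  Y = {'A': [1, 4, 7], 'B': [1, 4], 'C': [4, 5, 7],
       'D': [3, 5, 6], 'E': [2, 3, 6, 7], 'F': [2, 7]}
  X = {j: {r for r in Y if j in Y[r]} for j in range(1, 8)}
  print(next(solve(X, Y)))

The dancing-links version does the same covering and uncovering, just with doubly linked nodes so that removal and restoration are pointer splices instead of set operations.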
Now that you mention Donald Knuth, I can't resist sharing a really interesting answer he gave recently to one question from a set of twenty, related to algorithm design and analysis [1]:
15. Robert Tarjan, Princeton: What do you see as the most promising directions for future work in algorithm design and analysis? What interesting and important open problems do you see?
Don Knuth: My current draft about satisfiability already mentions 25 research problems, most of which are not yet well known to the theory community. Hence many of them might well be answered before Volume 4B is ready. Open problems pop up everywhere and often. But your question is, of course, really intended to be much more general.
In general I'm looking for more focus on algorithms that work fast with respect to problems whose size, n, is feasible. Most of today's literature is devoted to algorithms that are asymptotically great, but they are helpful only when n exceeds the size of the universe.
In one sense such literature makes my life easier, because I don't have to discuss those methods in TAOCP. I'm emphatically not against pure research, which significantly sharpens our abilities to deal with practical problems and which is interesting in its own right. So I sometimes play asymptotic games. But I sure wouldn't mind seeing a lot more algorithms that I could also use.
For instance, I've been reading about algorithms that decide whether or not a given graph G belongs to a certain class. Is G, say, chordal? You and others discovered some great algorithms for the chordality and minimum fillin problems, early on, and an enormous number of extremely ingenious procedures have subsequently been developed for characterizing the graphs of other classes. But I've been surprised to discover that very few of these newer algorithms have actually been implemented. They exist only on paper, and often with details only sketched.
Two years ago I needed an algorithm to decide whether G is a so-called comparability graph, and was disappointed by what had been published. I believe that all of the supposedly "most efficient" algorithms for that problem are too complicated to be trustworthy, even if I had a year to implement one of them.
Thus I think the present state of research in algorithm design misunderstands the true nature of efficiency. The literature exhibits a dangerous trend in contemporary views of what deserves to be published.
Another issue, when we come down to earth, is the efficiency of algorithms on real computers. As part of the Stanford GraphBase project I implemented four algorithms to compute minimum spanning trees of graphs, one of which was the very pretty method that you developed with Cheriton and Karp. Although I was expecting your method to be the winner, because it examines much of the data only half as often as the others, it actually came out two to three times worse than Kruskal's venerable method. Part of the reason was poor cache interaction, but the main cause was a large constant factor hidden by O notation.
The solver doesn't obviously negate the puzzle's NP-completeness. It first reduces the solution space with constraint propagation, then brute-forces what's left with backtracking search. I don't have a proof on hand, but that just smells nonpolynomial.
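To make that shape concrete, here's a minimal sketch of "eliminate, then search" (my own toy code, not the linked solver's): compute the legal candidates for each blank cell and backtrack over them. A real solver like Norvig's adds proper constraint propagation before and during the search.

  # Toy backtracking Sudoku solver: grid is a 9x9 list of lists, 0 = blank.

  def candidates(grid, r, c):
      """Digits not yet used in row r, column c, or the 3x3 box holding (r, c)."""
      used = set(grid[r]) | {grid[i][c] for i in range(9)}
      br, bc = 3 * (r // 3), 3 * (c // 3)
      used |= {grid[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
      return [d for d in range(1, 10) if d not in used]

  def solve(grid):
      """Fill blanks by depth-first search; returns True if a solution was found."""
      for r in range(9):
          for c in range(9):
              if grid[r][c] == 0:
                  for d in candidates(grid, r, c):
                      grid[r][c] = d
                      if solve(grid):
                          return True
                      grid[r][c] = 0     # undo and try the next digit
                  return False           # no digit fits here: backtrack
      return True                        # no blanks left: solved

On typical published puzzles the elimination step prunes almost everything, but nothing in this structure rules out exponential blowup on adversarial boards, which is the point.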
You could argue that, if we consider "Sudoku" to be the instance of the puzzle where n=9, then the puzzle isn't NP-hard because it has the constant-time solution of checking all ~6.67e21 completed grids, but that result is so uninteresting that it's understood the author means the more general problem.
Depends on the initial state of the puzzle. For some starting positions there's always a forced, deterministic next move to make. I guess this refers to cases that aren't like that?
Right; we're talking about upper bounds here. Assume that the boards are chosen by some kind of wily, malevolent demon with an obsessive desire to embed tricky 3SAT problems in everything.
Similarly, sometimes your sorting algorithm takes linear time because the input was already sorted. That's the best case, but best cases usually aren't as interesting as worst-case upper bounds.
It's easy to construct a Sudoku that only has a few numbers in it, such that you have to try filling it in many times before discovering a contradiction. Puzzle authors know this is not fun, so the ones used for entertainment avoid being pathological.
http://norvig.com/sudoku.html