Saturday, December 7, 2013

LeetCode OJ: Word Ladder I/II

Given two words (start and end), and a dictionary, find all shortest transformation sequence(s) from start to end, such that:
  1. Only one letter can be changed at a time
  2. Each intermediate word must exist in the dictionary
For example,
Given:
start = "hit"
end = "cog"
dict = ["hot","dot","dog","lot","log"]
Return
  [
    ["hit","hot","dot","dog","cog"],
    ["hit","hot","lot","log","cog"]
  ]
Note:
  • All words have the same length.
  • All words contain only lowercase alphabetic characters.

The above description is for Word Ladder II. Word Ladder one only require you to return the length of shortest path, which will be 5 in the above example.

The output part is basic DFS, and the key part is to find a parent map for the single source shortest path problem. Normally, when we have a graph G(V, E) and a single source s, Dijkstra's algorithm will build a parent map P: V -> V, which can be used to find the shortest path to any destination t from s. P(t) is simply the vetex v, such that path s->...->v->t is the shortest path from s to u. In Word Ladder II, since we are interested in not only one shortest path, but all shortest path, this parent map must be mapping from V to a subset of V, i.e. P: V -> V^2, as there might be more than one parent of a given vertex and we want to get all of them.

A simple modification of Dijkstra will work. Usually, when the temporary minimum distance of a vertex u is updated, we let the vertex v that updates it to be its parent. Now, we add the v to the parent set instead. When the minimum distance is decreased, we clear the original parent set and then add v; when the new distance is the same as the minimum distance, we add the v.

An accepted implementation is as follows:
public class Solution {
public static class Node {
public int dist;
public String value;
public Node(int dist, String value) {
this.dist = dist;
this.value = value;
}
}
public static class NodeComparator implements Comparator<Node> {
@Override
public int compare(Node o1, Node o2) {
return o1.dist - o2.dist;
}
}
// This is a n^2 algorithm, since we are not using heap for Dijkstra algorithm
public static ArrayList<ArrayList<String>> findLadders(String start, String end, HashSet<String> dict) {
if (!dict.contains(start)) {
dict.add(start);
}
if (!dict.contains(end)) {
dict.add(end);
}
int n = dict.size();
// Create adjacency list
HashMap<String, LinkedList<String>> adj = new HashMap<String, LinkedList<String>>(n);
for (String word: dict) {
char[] letters = word.toCharArray();
LinkedList<String> adjList = new LinkedList<String>();
for (int i = 0; i < letters.length; i++) {
char save = letters[i];
for (char replace = 'a'; replace <= 'z'; replace++) {
letters[i] = replace;
String neighbor = new String(letters);
if (replace != save && dict.contains(neighbor)) {
adjList.add(neighbor);
}
}
letters[i] = save;
}
adj.put(word, adjList);
}
// Run Dijkstra
HashMap<String, Integer> minDist = new HashMap<String, Integer>(n);
for (String word: dict) {
minDist.put(word, Integer.MAX_VALUE);
}
minDist.put(start, 0);
HashSet<String> visited = new HashSet<String>(n);
PriorityQueue<Node> pq = new PriorityQueue<Node>(n, new NodeComparator());
pq.add(new Node(0, start));
HashMap<String, LinkedList<String>> parent = new HashMap<String, LinkedList<String>>(n);
for (String word: dict) {
parent.put(word, new LinkedList<String>());
}
while (!pq.isEmpty()) {
boolean found = false;
Node cur = null;
while (!pq.isEmpty()) {
cur = pq.remove();
if (!visited.contains(cur.value) && cur.dist == minDist.get(cur.value)) {
found = true;
break;
}
}
if (!found || cur.dist == Integer.MAX_VALUE) {
break;
}
String value = cur.value;
visited.add(value);
for (String s: adj.get(value)) {
int newDist = minDist.get(value) + 1;
int oldDist = minDist.get(s);
if (newDist < oldDist) {
minDist.put(s, newDist);
parent.get(s).clear();
parent.get(s).add(value);
pq.add(new Node(newDist, s));
} else if (newDist == oldDist) {
parent.get(s).add(value);
}
}
}
// Recursively find all path
int shortest = minDist.get(end);
ArrayList<ArrayList<String>> result = new ArrayList<ArrayList<String>>();
if (shortest == Integer.MAX_VALUE) {
return new ArrayList<ArrayList<String>>();
} else {
int size = minDist.get(end) + 1;
search(result, new ArrayList<String>(shortest), end, 0, size, dict, parent);
return result;
}
}
public static void search(ArrayList<ArrayList<String>> result, ArrayList<String> current,
String cur, int step, int maxStep, HashSet<String> dict, HashMap<String, LinkedList<String>> parent) {
current.add(cur);
step++;
if (step == maxStep) {
result.add(inverseList(current));
} else {
for (String p: parent.get(cur)) {
search(result, current, p, step, maxStep, dict, parent);
}
}
step--;
current.remove(step);
}
public static ArrayList<String> inverseList(ArrayList<String> a) {
int n = a.size();
ArrayList<String> ret = new ArrayList<String>(n);
for (int i = 0; i < n; i++) {
ret.add(a.get(n - i - 1));
}
return ret;
}
}

Notice that we used the adjacency list and priority queue for optimal performance. Otherwise it can not pass the large test cases.

Suppose we have n words in the dictionary, each word is at most m characters long, then a naive way to create the adjacency list by comparing each pair of words suffers O(m*n^2) complexity, which is unacceptable for large n like 20000. In my implementation, we instead find neighbors of a word w by trying to change each of its letters to another one, and the complexity is reduces to (m*n*26).

For the PriorityQueue implementation in Java, decreaseKey operation (decrease the priority of a node given its reference, the priority queue shall rearrange to keep heap property), which is used in Dijkstra's algorithm is not supported. Rather than implement my own priority queue with this operation, I simply insert a node with the decreased key instead. In this way, there will be more than one node for the same vertex in the priority queue, one with the newest/smallest key, whereas the rest are with previous large keys. Since we can keep track of the smallest temporary distance of all the vertices, we can simply ignore a node with key larger than that when doing a removeMin operation. A simple analysis show that this will potential result in larger heap, so the complexity is increased from O(E + VlogV) to O(E + VlogE), which is essentially the same since logE is at most 2logV.

BFS can also be used for this problem since edge weights are uniformly one. Similar, one need to maintain a parent map that is one to many instead of one to one. An BFS implementation is as follows. The running time on LeetCode is not better that the Dijkstra one. Both of them takes more than 1000 ms. Maybe Java is too slow?


public class Solution1 {
public static class Node {
public int level;
public String value;
public Node(String value, int level) {
this.level = level;
this.value = value;
}
}
// This is a n^2 algorithm, since we are not using heap for Dijkstra algorithm
public static ArrayList<ArrayList<String>> findLadders(String start, String end, HashSet<String> dict) {
if (!dict.contains(start)) {
dict.add(start);
}
if (!dict.contains(end)) {
dict.add(end);
}
int n = dict.size();
// Create adjacency list
HashMap<String, LinkedList<String>> adj = new HashMap<String, LinkedList<String>>(n);
for (String word: dict) {
char[] letters = word.toCharArray();
LinkedList<String> adjList = new LinkedList<String>();
for (int i = 0; i < letters.length; i++) {
char save = letters[i];
for (char replace = 'a'; replace <= 'z'; replace++) {
letters[i] = replace;
String neighbor = new String(letters);
if (replace != save && dict.contains(neighbor)) {
adjList.add(neighbor);
}
}
letters[i] = save;
}
adj.put(word, adjList);
}
// Run BFS
HashMap<String, Integer> minDist = new HashMap<String, Integer>(n);
for (String word: dict) {
minDist.put(word, Integer.MAX_VALUE);
}
minDist.put(start, 0);
HashSet<String> visited = new HashSet<String>(n);
HashMap<String, LinkedList<String>> parent = new HashMap<String, LinkedList<String>>(n);
for (String word: dict) {
parent.put(word, new LinkedList<String>());
}
LinkedList<Node> queue = new LinkedList<Node>();
visited.add(start);
queue.addLast(new Node(start, 0));
while (!queue.isEmpty()) {
Node cur = queue.removeFirst();
String value = cur.value;
for (String s: adj.get(value)) {
int newDist = cur.level + 1;
int oldDist = minDist.get(s);
if (newDist < oldDist) {
minDist.put(s, newDist);
parent.get(s).clear();
parent.get(s).add(value);
} else if (newDist == oldDist) {
parent.get(s).add(value);
}
if (!visited.contains(s)) {
queue.add(new Node(s, newDist));
visited.add(s);
}
}
}
// Recursively find all path
int shortest = minDist.get(end);
ArrayList<ArrayList<String>> result = new ArrayList<ArrayList<String>>();
if (shortest == Integer.MAX_VALUE) {
return new ArrayList<ArrayList<String>>();
} else {
int size = minDist.get(end) + 1;
search(result, new ArrayList<String>(shortest), end, 0, size, dict, parent);
return result;
}
}
public static void search(ArrayList<ArrayList<String>> result, ArrayList<String> current,
String cur, int step, int maxStep, HashSet<String> dict, HashMap<String, LinkedList<String>> parent) {
current.add(cur);
step++;
if (step == maxStep) {
result.add(inverseList(current));
} else {
for (String p: parent.get(cur)) {
search(result, current, p, step, maxStep, dict, parent);
}
}
step--;
current.remove(step);
}
public static ArrayList<String> inverseList(ArrayList<String> a) {
int n = a.size();
ArrayList<String> ret = new ArrayList<String>(n);
for (int i = 0; i < n; i++) {
ret.add(a.get(n - i - 1));
}
return ret;
}
}

No comments:

Post a Comment