17.11. 单词距离 · 震朕的小宇宙

题目描述

有个内含单词的超大文本文件，给定任意两个不同的单词，找出在这个文件中这两个单词的最短距离(相隔单词数)。如果寻找过程在这个文件中会重复多次，而每次寻找的单词不同，你能对此优化吗?

示例：

1 2	输入：words = ["I","am","a","student","from","a","university","in","a","city"], word1 = "a", word2 = "student" 输出：1

提示：

words.length <= 100000

解题思路 & 源码

暴力题解

先考虑暴力题解，把所有要匹配的单词在数组中的位置都找出来，然后两两比较，找出最小的距离。

func distance(pos1, pos2 int) int {
    if pos1 > pos2 {
        return pos1 - pos2
    }

    return pos2 - pos1
}

func findClosest(words []string, word1 string, word2 string) int {
    var word1Postions []int
    var word2postions []int

    for i, w := range words {
        if w == word1 {
            word1Postions = append(word1Postions, i)
        } else if w == word2 {
            word2postions = append(word2postions, i)
        }
    }

    minDist := -1
    for _, pos1 := range word1Postions {
        for _, pos2 := range word2postions {
            if minDist == -1 || distance(pos1, pos2) < minDist {
                minDist = distance(pos1, pos2) 
            }
        }
    }

    return minDist
}

这样虽然时间复杂度为 O(n^2)，但是也能通过，并且发现其实 leetcode 对 abs 做过优化，也就是说如果修改计算距离的函数，能够提高通过率：

func abs(x int) int {
    if x > 0 {
        return x
    }

    return -x
}

func distance(pos1, pos2 int) int {
    return abs(pos2 - pos1)
}

优化

当然其实上面有一个优化的点，就是在一个遍历里完成查找和比较，这样可以降低时间复杂度。具体来说就是，遍历的时候记录每一次 word1 和 word2 的位置，然后比较距离，如果距离小于当前最小距离，就更新最小距离。

func abs(x int) int {
    if x < 0 {
        return -x
    }
    return x
}

func distance(pos1, pos2 int) int {
	return abs(pos2 - pos1)
}

func findClosest(words []string, word1 string, word2 string) int {
    word1Pos := -1
    word2Pos := -1
    minDistance := len(words)

    for i, word := range words {
        if word == word1 {
            word1Pos = i
        }
        if word == word2 {
            word2Pos = i
        }
        if word1Pos > -1 && word2Pos > -1 {
            if dist := distance(word1Pos, word2Pos); dist < minDistance {
                minDistance = dist
            }
        }
    }

    return minDistance
}

注意这里 word1Pos 和 word2Pos 的初始值是 -1，表示还没有找到对应的单词。每次遍历的时候，如果找到了对应的单词，就更新其位置，然后计算距离，如果距离小于当前最小距离，就更新最小距离。这样就能在 O(n) 的时间复杂度内完成查找和比较，效率更高。

使用 Map

其实题目中有提到，如果寻找过程在这个文件中会重复多次，而每次寻找的单词不同，你能对此优化吗?

可以考虑使用 map 来存储每个单词在数组中的位置，这样就能在 O(1) 的时间复杂度内找到对应的单词的位置。具体来说就是，遍历的时候把每个单词的位置存入 map 中，然后在查找的时候直接从 map 中获取对应的单词的位置。

错误题解

这里一开始想岔了，想法是遍历的时候，尝试找到 word1 和 word2 的位置，然后比较距离，只有距离更小的时候才更新其坐标，而最后再根据坐标计算最小距离。但是这会出现一个问题，就是如果 word1 和 word2 的位置在数组中是交错的，比如 word1 在 word2 前面，或者 word2 在 word1 前面，这样就会导致计算错误。错误代码如下：

func abs(x int) int {
    if x < 0 {
        return -x
    }
    return x
}

func distance(pos1, pos2 int) int {
	return abs(pos2 - pos1)
}

func findClosest(words []string, word1 string, word2 string) int {
	word1Postion := -1
	word2postion := -1

	for i, w := range words {
		if word1Postion == -1 || word2postion == -1 {
			if w == word1 {
				word1Postion = i
			} else if w == word2 {
				word2postion = i
			}
		} else {
			if w == word1 {
				if distance(word1Postion, word2postion) > distance(i, word2postion) {
					word1Postion = i
				}
			} else if w == word2 {
				if distance(word1Postion, word2postion) > distance(i, word1Postion) {
					word2postion = i
				}
			}
		}
	}

	return distance(word1Postion, word2postion)
}