LeetCode 139. Word Break

程序员文章站 2023-12-21 18:29:52


Original Problem

Given a non-empty string s and a dictionary wordDict containing a list of non-empty words, determine if s can be segmented into a space-separated sequence of one or more dictionary words.

Note:

The same word in the dictionary may be reused multiple times in the segmentation.
You may assume the dictionary does not contain duplicate words.

Example 1:

Input: s = "leetcode", wordDict = ["leet", "code"]
Output: true
Explanation: Return true because "leetcode" can be segmented as "leet code".

Example 2:

Input: s = "applepenapple", wordDict = ["apple", "pen"]
Output: true
Explanation: Return true because "applepenapple" can be segmented as "apple pen apple".
             Note that you are allowed to reuse a dictionary word.

Example 3:

Input: s = "catsandog", wordDict = ["cats", "dog", "sand", "and", "cat"]
Output: false

Reference Answer

Approach

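This is another classic DP (Dynamic Programming) problem. The blogger once said that problems that work on subarrays or substrings and ask for an extreme value are almost always DP. This problem does not ask for an extreme value, but working on substrings still fits the state-transition pattern of DP. (Transferring one person's warmth to another person's chest... ahem, wrong set, that one is 爱情转移 (Love Transfer). Back on track.)

The two hard parts of a DP solution are defining the dp array and finding the state transition equation. Start with the dp array: a one-dimensional array is enough here, where dp[i] indicates whether the substring in the range [0, i) can be segmented. Note that the dp array is one element longer than s, because the empty string has to be handled. dp[0] is initialized to true, and then the traversal begins.

Two for loops are needed: there is no recursive function here, so every substring has to be visited explicitly. The index j splits the range [0, i) into two parts, [0, j) and [j, i). The range [0, j) is dp[j], and the range [j, i) is s.substr(j, i-j). dp[j] is an earlier state that has already been computed and can be read directly, so the only remaining work is to look up s.substr(j, i-j) in the dictionary. If both are true, set dp[i] to true and break; there is no need to keep trying other values of j, because [0, i) is already known to be segmentable. Finally, return the last value of the dp array, which is the boolean for whether the whole string can be segmented.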

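The reference answer uses a neat trick: the mark array has length len(mark) = len(s) + 1, and mark[0] is set to True so that the empty prefix can act as the first split point. The results for the prefixes of s are stored in mark[1:len(s)+1]. This also avoids wrongly returning True when s = "a" and the dictionary is ["b"]: the returned value is mark[len(s)], so when both have length 1 but do not match, the output is still False.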
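To make the description above concrete, here is a minimal Python sketch of the dp table, including the early break once a valid split is found. The function names word_break_table and word_break, and the conversion of the dictionary to a set, are assumptions made for illustration and are not part of the reference answer.

```python
from typing import List


def word_break_table(s: str, word_dict: List[str]) -> List[bool]:
    """Return the full dp table; dp[i] is True when s[0:i] can be segmented."""
    words = set(word_dict)        # assumption: a set for faster membership tests
    dp = [False] * (len(s) + 1)   # one extra slot for the empty prefix
    dp[0] = True                  # the empty prefix is always segmentable
    for i in range(1, len(s) + 1):
        for j in range(i):
            # [0, j) is already solved; only [j, i) needs a dictionary lookup.
            if dp[j] and s[j:i] in words:
                dp[i] = True
                break             # one valid split point is enough for this i
    return dp


def word_break(s: str, word_dict: List[str]) -> bool:
    """The answer is the last entry of the dp table."""
    return word_break_table(s, word_dict)[-1]


if __name__ == "__main__":
    print(word_break_table("leetcode", ["leet", "code"]))
    print(word_break_table("a", ["b"]))
```

For "leetcode" only positions 0, 4, and 8 are True, and for s = "a" with dictionary ["b"] the table is [True, False], so the final answer is False even though dp[0] is True.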

Code

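class Solution:
    def wordBreak(self, s, wordDict):
        """
        :type s: str
        :type wordDict: List[str]
        :rtype: bool
        """
        # mark[i] is True when the prefix s[:i] can be segmented.
        mark = [False for _ in range(len(s)+1)]
        # The empty prefix is always segmentable; it serves as the first split point.
        mark[0] = True
        for index in range(len(s)+1):
            for inner_index in range(index):
                # s[:index] works if s[:inner_index] works and the remainder is a word.
                if mark[inner_index] and s[inner_index:index] in wordDict:
                    mark[index] = True
        return mark[len(s)]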

C++ version

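#include <algorithm>
#include <string>
#include <vector>
using namespace std;

class Solution {
public:
    bool wordBreak(string s, vector<string>& wordDict) {
        // mark[i] is true when the prefix s[0, i) can be segmented.
        vector<bool> mark(s.size() + 1, false);
        mark[0] = true;
        for (size_t index = 0; index <= s.size(); ++index) {
            for (size_t inner_index = 0; inner_index < index; ++inner_index) {
                // std::string::substr takes (start, length), not (start, end).
                if (mark[inner_index] &&
                    find(wordDict.begin(), wordDict.end(),
                         s.substr(inner_index, index - inner_index)) != wordDict.end()) {
                    mark[index] = true;
                }
            }
        }
        return mark[s.size()];
    }
};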

Note

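Problems that work on subarrays or substrings and ask for an extreme value are almost always DP. This problem does not ask for an extreme value, but working on substrings still fits the state-transition pattern of DP.
My understanding of the main difference between DP and backtracking: backtracking is geared toward finding paths (when the output must list all possible combinations, all traversal paths, and so on), while DP is better suited to value problems on subarrays and substrings (where the point is to compute a value), such as the number of paths that reach a given point or the number of combinations that sum to a fixed value.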
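The contrast between the two techniques can be sketched on this same problem. The example below is illustrative only: all_segmentations and count_segmentations are hypothetical helper names, and the input "catsanddog" is chosen here for the demonstration rather than taken from the problem statement. Backtracking lists every segmentation, while DP only counts them.

```python
from typing import List


def all_segmentations(s: str, word_dict: List[str]) -> List[str]:
    """Backtracking: enumerate every way to split s into dictionary words."""
    words = set(word_dict)
    results: List[str] = []
    path: List[str] = []  # words chosen so far on the current branch

    def backtrack(start: int) -> None:
        if start == len(s):
            results.append(" ".join(path))
            return
        for end in range(start + 1, len(s) + 1):
            piece = s[start:end]
            if piece in words:
                path.append(piece)   # choose
                backtrack(end)       # explore
                path.pop()           # undo the choice

    backtrack(0)
    return results


def count_segmentations(s: str, word_dict: List[str]) -> int:
    """DP: count the ways to split s without listing them."""
    words = set(word_dict)
    ways = [0] * (len(s) + 1)
    ways[0] = 1  # exactly one way to segment the empty prefix
    for i in range(1, len(s) + 1):
        for j in range(i):
            if ways[j] and s[j:i] in words:
                ways[i] += ways[j]
    return ways[len(s)]


if __name__ == "__main__":
    dictionary = ["cat", "cats", "and", "sand", "dog"]
    print(all_segmentations("catsanddog", dictionary))
    print(count_segmentations("catsanddog", dictionary))
```

The backtracking version has to carry the current path and undo each choice, whereas the DP version keeps only one number per prefix, which is why it suits questions that ask for a value rather than the paths themselves.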
