Longest substring with at most K unique characters

Objective: Given a string, write an algorithm to find the longest substring with at most K characters.

Example:

Input: aabbaacdeeeeddded, K = 3
Output: Longest substring with 3 most unique characters is: cdeeeeddded with length 11

Input: abcddefabc, K = 4
Output: Longest substring with 4 most unique characters is: abcdd with length 5

Input: aaaabbbb, K = 4
Output: Not enough unique character is present in the input string

Approach: Brute Force

Identify all the substrings of the given input string and check each substring with the number of unique characters in it and pick the one with maximum length.

Time Complexity: O(N3), N2 for all finding the substrings and O(N) for each substring while checking unique characters.

Better Solution – Linear Search 

  • Check if the string has at least K unique characters if no then prints the error message and return.
  • Let’s say we take substring between indexes curr_start and curr_end, both points to index 0 at the beginning.
  • Keep adding one character at a time to the substring by moving curr_end pointer to the right and make sure that condition of substring has at most K unique characters , is valid by adding the new character and if any point the condition is not valid then start moving curr_start pointer to the right till condition is valid again.
  • At each step, keep track of the length of a maximum substring by doing curr_end-curr_start.  
  • To check whether substring has only K unique characters, maintain an array of size 26 ( 26 alphabets) and the array is filled with the count of characters in the string to the respective index. 0th index for ‘a’ and 25th index for z.
  • See the example below for more understanding.

Time Complexity: O(N)

Input: abacdedd, K = 3
Processing character:a
SubString: a
a:1, 
curr_start: 0, curr_end: 1, max_Start: 0, max_Length: 1
-----------------------------------------------
Processing character:b
SubString: ab
a:1, b:1, 
curr_start: 0, curr_end: 2, max_Start: 0, max_Length: 2
-----------------------------------------------
Processing character:a
SubString: aba
a:2, b:1, 
curr_start: 0, curr_end: 3, max_Start: 0, max_Length: 3
-----------------------------------------------
Processing character:c
SubString: abac
a:2, b:1, c:1, 
curr_start: 0, curr_end: 4, max_Start: 0, max_Length: 4
-----------------------------------------------
Processing character:d
SubString: abacd
a:2, b:1, c:1, d:1, 
More than 3 unique characters in substring, moving curr_start to right
curr_start: 1, curr_end: 5, max_Start: 0, max_Length: 4
a:1, b:1, c:1, d:1, 
More than 3 unique characters in substring, moving curr_start to right
curr_start: 2, curr_end: 5, max_Start: 0, max_Length: 4
a:1, c:1, d:1, 
curr_start: 2, curr_end: 5, max_Start: 0, max_Length: 4
-----------------------------------------------
Processing character:e
SubString: acde
a:1, c:1, d:1, e:1, 
More than 3 unique characters in substring, moving curr_start to right
curr_start: 3, curr_end: 6, max_Start: 0, max_Length: 4
c:1, d:1, e:1, 
curr_start: 3, curr_end: 6, max_Start: 0, max_Length: 4
-----------------------------------------------
Processing character:d
SubString: cded
c:1, d:2, e:1, 
curr_start: 3, curr_end: 7, max_Start: 0, max_Length: 4
-----------------------------------------------
Processing character:d
SubString: cdedd
c:1, d:3, e:1, 
curr_start: 3, curr_end: 8, max_Start: 3, max_Length: 5
Longest substring with 3 most unique characters is : cdedd with length 5

Complete Code:

Output:

Input: aabbaacdeeeeddded
Longest substring with 4 most unique characters is : aacdeeeeddded with length 13

__________________________________________________
Top Companies Interview Questions..-

Google Microsoft Amazon Facebook more..

If you find anything incorrect or you feel that there is any better approach to solve the above problem, please write comment.
__________________________________________________

You may also like...

%d bloggers like this: