# Text Processing

## Count Word Frequency in Stream

Introduction Continuing our discussions on Tries, here is another practical problem which is used in many text processing scenarios. Counting word frequency in stream translates to a scenario where you are given a list of words (dictionary). Now, there is an infinite stream of characters coming in. Your task is to print the frequency of words appearing in the stream. To know more about the basics of building a Trie, please visit my previous post Example Problem for counting word frequency in stream Let us say we have a dictionary which contains the following words aca cat hell...

## Building an autocomplete system using Trie

Introduction Few days back I wrote a post about the Trie data structure. This post is a continuation to it, and intends to introduce some practical problems which Trie solves. An autocomplete system is a perfect use case. Many of the websites use this feature to help there users with suggestions & autocomplete. Also, in this post we will define a Trie interface and then work through this problem statement. An autocomplete system using Trie As we know from the previous text that a Trie is a tree like data structure which stores words such that the search for a word is proportional to ...

## Huffman Decoding

Introduction Huffman coding is an encoding mechanism by which a variable length code word is assigned to each fixed length input character that is purely based on their frequency of occurrence of the character in the text to be encoded. It is common to assign shorter codes to more frequent characters and the less frequent characters are assigned longer code words. A code word is a binary string (a sequence of zeroes and ones). A Huffman tree is made for an input string and characters are decoded based on their position in the tree. The decoding process is as follows: We start from t...

## Tries for Text Processing

Introduction Now that we have talked so much about efficient ways of pattern finding using Boyer Moore and KMP algorithm. Let us try and understand other aspects of Text Processing as well. This post will mostly focus on using tries for text processing, storing text efficiently and also discuss about data structures which may make more sense when we talk about storage and retrieval of text based on some given criteria. Here are some real world problems which often demand quicker answers: Ever wondered why you are able to search a word in a dictionary so fast? How do you think one w...