Nuh Aydin
Normally, we associate the idea of redundancy with such concepts as wastefulness, uselessness, repetition, and superfluity. However, there are many instances where redundancy can actually be very useful. One of the prime examples is language. In fact, in linguistics redundancy is considered to be a crucial feature, not a deficiency, of a language. The main use of redundancy is to increase the chance that the receiver (listener or reader) recovers the original message when the message received is not the same as the message sent, due to such factors as noise, lack of clarity, ambiguity, hearing difficulty, and so forth. We employ redundancy in a natural way in learning and processing language, without even noticing it. It turns out that the basic principles that we use in human language can also be applied in the precise language of mathematics to deal with errors that noise or other external factors introduce into digital messages during transmission. We will explain these ideas in more detail in the rest of the article.
Use of redundancy in human communication
We make use of the redundancy that is present in human language to correct errors. This happens in both oral and written communication. For example, if you read the sentence “There is a miscake in this sentence,” you can tell that something is wrong. So we can detect an error. Moreover, we can even correct it. We are achieving two things here: error detection and error correction. What are the principles that we are using to achieve these goals? First, because the string “miscake” is not a valid word in English, we know that there is an error. Here, the redundancy manifests itself in the fact that not every possible string is a valid word in the language. In a sense, some strings are wasted: potentially they could have been used as words of a language but they are not. The benefit of this “wastefulness” is that it lets us detect or correct errors in communication. Second, the word “miscake” is closest to the valid word “mistake” in the language, so we conclude that it is the most likely word intended. Of course, we can also use the context and meaning to detect and correct errors, but that is an additional feature, not available to computers. If you enter the string “mistaky” into the Merriam-Webster online dictionary, 1 it cannot find an entry for it; however, it comes up with a list of suggested words, the first of which is “mistake.” So the computer is telling us that “mistake” is the most likely word intended because it is closest to the given string. This is called the maximum likelihood principle. As I type this article on my computer I witness many instances of this principle used by my word processor. For instance, when I mistakenly typed “fisrt” it automatically corrected it to “first.”
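The two principles described above (reject strings that are not valid words, then pick the closest valid word) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tiny vocabulary is made up, the names `hamming_distance` and `check_word` are hypothetical, and "closeness" is measured simply by counting differing letters between words of equal length. Real spell checkers use richer measures that also allow for inserted, deleted, or swapped letters.

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Hypothetical toy vocabulary; a real dictionary holds many thousands of words.
VOCABULARY = ["mistake", "message", "mission", "sentence"]


def check_word(received, vocabulary=VOCABULARY):
    """Return (error_detected, best_guess) for a received string."""
    if received in vocabulary:
        # Valid word: no error detected.
        return False, received
    # Error detected: look for the closest valid word of the same length.
    candidates = [w for w in vocabulary if len(w) == len(received)]
    if not candidates:
        return True, None
    best = min(candidates, key=lambda w: hamming_distance(received, w))
    return True, best


print(check_word("miscake"))  # (True, 'mistake')
```

Here “miscake” differs from “mistake” in one letter but from every other candidate in at least three, so “mistake” is chosen as the most likely intended word.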
There are also other ways redundancy is used in natural languages. As already pointed out above, redundancy in context often enables us to detect and correct errors, vagueness and ambiguities. When humans communicate, redundancy, either explicitly introduced by the speaker or author or built into the language, comes into play to help the audience understand the message better and to overcome such obstacles as noise, accent, hearing difficulties, and so on. Shetter [4] gives a number of examples in which redundancy is manifest and useful in languages. We include a few interesting examples from his article here.
1. If we strike out all the vowels in a sentence, “xt slxws yxx dxwn bxt thx sxntxncx xs stxll lxgxblx, xsn’t xt”? (Can you read the part in quotes?) Since the consonants seem to be giving us most of the information we need, there must be a lot of redundancy here too.
2. The sentence “These three dogs are retrievers” shows grammatical redundancy in forms: plurality is expressed multiple times. Examples in other languages are just as easy to find, for instance, obligatory gender agreement in a language such as Spanish: La única otra señora venezolana, “The only other Venezuelan lady.”
3. A language’s stock of words (called the lexicon) shows a lot of redundant overlapping. To be convinced of this, all you have to do is to grab a thesaurus and look up a few words (big, little, fat, to die) that have lots of near-synonyms with only small stylistic differences.
4. Even the way languages are written is highly redundant. Try another experiment: take a piece of paper and cover up the LOWER HALF of all the letters in any sentence you have not read yet. If it is not significantly harder to read, that means that a lot in the shapes of the letters is redundant (could you still manage to read with 2/3 covered?).
Mathematical use of redundancy in digital communication
As we see, redundancy is present and useful in human languages in a number of different ways. Engineers have considered the question of whether computers can use some of the same principles to achieve error detection and correction in digital communication. Since computers have very limited capabilities compared to humans, for example they cannot make sense of words, it is the method of explicitly adding redundancy to original messages (as opposed to using the context) that can be used to achieve this goal in computers using the precise language of mathematics.
To illustrate the use of redundancy in a mathematical way in digital communication systems, consider the following example. Suppose we want to communicate with another party in a simple manner: sending messages that represent Yes or No. Let us agree that a 1 represents Yes and a 0 (zero) represents No. Unfortunately, there is often noise in the communication channel, which may distort messages by flipping a binary bit (turning a 0 into a 1, or a 1 into a 0). If we just send the messages as they are, do we have any way of knowing whether an error occurred during the transmission? We do not: a received 1 might be a genuine Yes or a corrupted No. The reason we can do nothing against errors is that all possible strings (which all have length 1 in this simple example) are valid codewords. Codewords in digital communication correspond to valid words in a language. Compare this to the earlier example about correcting the typo in the word “miscake.”
Data of any kind is stored and processed as binary strings, that is strings of 0s and 1s, in computers. Every letter has an ASCII code. For example, the ASCII code of the letter “A” is 01000001. Typically, data consists of billions of bits. A bit is a 0 or a 1. To employ redundancy, data is broken into blocks of a fixed length. We now consider and compare several encoding schemes where the block size is 4.
Scheme 1: Perhaps the most intuitive way of adding redundancy is simply to repeat the original message. Instead of sending 1011, we send 10111011. The string obtained after adding redundancy is called a codeword, so here 1011 is the original message and 10111011 is the codeword. What does this scheme buy us? Do we get any error detection or correction capability? If you think about this for a moment, you can see that if there is a single error, then it can be detected. We simply break the received word in half, and compare the two halves. If there is exactly one error, the two halves will not be the same. We also note, however, that we cannot correct any errors: when the halves disagree, we have no way of telling which half is the damaged one. Also, if the number of errors is 2 (or any even number), we may not be able to detect them, depending on the locations of the errors.
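As a minimal sketch of Scheme 1, the following Python code encodes a message by repeating it and detects errors by comparing the two halves. The function names (`encode_repeat2`, `has_error`, `flip`) are chosen here for illustration only, and bit strings are represented as ordinary text strings of 0s and 1s.

```python
def encode_repeat2(message):
    """Scheme 1: send the message twice."""
    return message + message


def has_error(received):
    """Report an error when the two halves of the received word differ."""
    half = len(received) // 2
    return received[:half] != received[half:]


def flip(word, *positions):
    """Simulate channel noise by flipping the bits at the given positions."""
    bits = list(word)
    for i in positions:
        bits[i] = "1" if bits[i] == "0" else "0"
    return "".join(bits)


codeword = encode_repeat2("1011")
print(codeword)                         # 10111011
print(has_error(flip(codeword, 2)))     # True: a single error is detected
print(has_error(flip(codeword, 0, 4)))  # False: the same bit is hit in both halves
```

The last line shows the weakness mentioned above: two errors that strike the same position in both halves leave the halves equal, so the damage goes unnoticed.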
To quantify what we gain by employing an encoding scheme, let us assume that the probability of a bit error for a channel is 0.001, and there are about 3000 bits on a page. If we do not employ any encoding scheme, we expect to have an average of 3 words in error per page. If we employ this scheme though, there must be at least 2 errors per word in order for an error to go unnoticed. This improves the expected number of incorrect words to 1 in about 50 pages. Can we do better?
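One way to arrive at figures of this size is sketched below. It rests on assumptions that are not spelled out in the text: the 3000 bits per page are taken to be information bits, grouped into 750 words of 4 bits each; bit errors are taken to be independent; and, for the repetition scheme, every codeword that suffers two or more bit errors is counted as a potentially unnoticed error, which is an upper estimate since many double errors are in fact detected. The helper name `prob_at_least` is hypothetical.

```python
from math import comb

p = 0.001                   # probability that a single bit is flipped
info_bits_per_page = 3000
block_size = 4
words_per_page = info_bits_per_page // block_size  # 750 words per page


def prob_at_least(k, n, p):
    """Probability of at least k bit errors among n independent bits."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


# No encoding: a 4-bit word is wrong as soon as one of its bits is flipped.
uncoded_per_page = words_per_page * prob_at_least(1, block_size, p)

# Scheme 1: an 8-bit codeword needs at least 2 bit errors to slip through.
scheme1_per_page = words_per_page * prob_at_least(2, 2 * block_size, p)

print(round(uncoded_per_page, 2))   # roughly 3 words in error per page
print(round(1 / scheme1_per_page))  # roughly one such word every 48 pages
```

Under these assumptions the calculation gives about 3 erroneous words per page without encoding and about one per 48 pages with repetition, in line with the rounded figures quoted above.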
Scheme 2: This scheme repeats everything 3 times. So the original message 1011 is encoded as 101110111011. What are the pros and cons of this scheme? It is not hard to see that not only can we detect single or double errors, we can also correct single errors by using the “majority opinion”: for each bit position, we take the value that appears in at least two of the three copies. This improvement comes with a cost though: only 1 out of every 3 bits sent is an information bit (so 2 out of 3 are redundancy bits). We say that the rate of this code is 1/3. The rate of the previous code was 1/2. With this improved error correction capacity, the expected number of incorrect words is 1 in about 6250 pages.
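The “majority opinion” rule of Scheme 2 can be sketched as follows. The function names `encode_repeat3` and `decode_repeat3` are illustrative, and the received word is assumed to have a length that is a multiple of 3.

```python
def encode_repeat3(message):
    """Scheme 2: send the message three times."""
    return message * 3


def decode_repeat3(received):
    """Recover the message by a majority vote on each bit position."""
    n = len(received) // 3
    copies = [received[i * n:(i + 1) * n] for i in range(3)]
    decoded = []
    for column in zip(*copies):
        # column holds the three received copies of one message bit
        decoded.append("1" if column.count("1") >= 2 else "0")
    return "".join(decoded)


print(encode_repeat3("1011"))          # 101110111011
print(decode_repeat3("001110111011"))  # 1011: the single error is outvoted
print(decode_repeat3("001100111011"))  # 0011: two errors on the same bit win the vote
```

A single flipped bit is always outvoted by the two intact copies. Two errors that hit the same message bit in different copies, as in the last line, produce a wrong decoding, which is why this scheme is only guaranteed to correct single errors.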
Scheme 3: This is a well-known and commonly used encoding scheme that adds a single parity check bit at the end so that the number of 1’s in the resulting codeword is even. Therefore, the original information 1011 is encoded as the codeword 10111. Another way of describing this method is that the modulo 2 sum of all bits (including the redundancy bit) is 0. In modulo 2 arithmetic 1+1=0. It is easy to see that this scheme detects any single error, since flipping one bit makes the number of 1’s odd, but it cannot correct the error, because the parity check does not reveal which bit was flipped.
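A short sketch of the parity check scheme, with the illustrative function names `add_parity` and `parity_ok`:

```python
def add_parity(message):
    """Scheme 3: append one bit so that the number of 1s becomes even."""
    parity = message.count("1") % 2
    return message + str(parity)


def parity_ok(received):
    """A received word passes the check when it contains an even number of 1s."""
    return received.count("1") % 2 == 0


print(add_parity("1011"))  # 10111
print(parity_ok("10111"))  # True: no error
print(parity_ok("10011"))  # False: one flipped bit is detected
print(parity_ok("10001"))  # True: two flipped bits cancel out and go unnoticed
```

The rate of this code is 4/5, far better than the repetition schemes, but as the last line shows, any even number of errors passes the check unnoticed.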
Scheme 4: This is also a well-known example of an error correcting code, and one of the earliest codes designed. It was discovered by R. Hamming [1]. In this scheme 3 bits of redundancy are added to the 4 information bits. The first redundancy bit, or the fifth bit of the codeword, is the sum of the first, second, and fourth bits. The next redundancy bit is the sum of the first, third, and fourth bits. The last bit is the sum of the second, third, and fourth bits. All sums are modulo 2. According to this scheme, the information bits 1011 are encoded as 1011010. Although it is not obvious, this code can correct any single error. Therefore, compared to the second scheme above, the Hamming code achieves the same error correction ability in a more efficient way: the information rates are 1/3 vs. 4/7.
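To make the single-error correction less mysterious, here is a minimal sketch of the code exactly as described above. The encoder follows the three sums given in the text. The decoder is one standard approach (syndrome decoding) and is an addition for illustration, not something spelled out in the article: it recomputes the three checks and uses the pattern of failed checks to locate the flipped bit. All names (`CHECKS`, `encode_hamming`, `syndrome`, `decode_hamming`) are hypothetical.

```python
# Indices (counting from 0) of the information bits summed by each redundancy bit:
# bit 5 = 1st + 2nd + 4th, bit 6 = 1st + 3rd + 4th, bit 7 = 2nd + 3rd + 4th.
CHECKS = [(0, 1, 3), (0, 2, 3), (1, 2, 3)]


def encode_hamming(message):
    """Append the three modulo 2 check sums to the 4 information bits."""
    bits = [int(b) for b in message]
    parity = [sum(bits[i] for i in check) % 2 for check in CHECKS]
    return "".join(str(b) for b in bits + parity)


def syndrome(word):
    """Recompute each check on a 7-bit word; a 1 marks a check that fails."""
    bits = [int(b) for b in word]
    return tuple(
        (sum(bits[i] for i in check) + bits[4 + k]) % 2
        for k, check in enumerate(CHECKS)
    )


# Each single-bit error fails a different set of checks, so the pattern of
# failed checks identifies the position of the error.
SYNDROME_TO_POSITION = {}
for pos in range(7):
    error = ["0"] * 7
    error[pos] = "1"
    SYNDROME_TO_POSITION[syndrome("".join(error))] = pos


def decode_hamming(received):
    """Correct up to one flipped bit and return the 4 information bits."""
    bits = list(received)
    s = syndrome(received)
    if s != (0, 0, 0):
        pos = SYNDROME_TO_POSITION[s]
        bits[pos] = "1" if bits[pos] == "0" else "0"
    return "".join(bits[:4])


print(encode_hamming("1011"))     # 1011010
print(decode_hamming("1111010"))  # 1011: the flipped second bit is corrected
```

The key observation is that the seven possible single-bit errors produce seven different nonzero patterns of failed checks, which is exactly what three check bits can distinguish.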
Although codes used in practice are longer and more sophisticated, the basic principles are the same. These examples show that there are different ways of employing redundancy, some more efficient than others. The question is, therefore, not whether or not redundancy can be useful but how best to use it. Error correcting codes are used in a wide range of communication systems from deep space communication, to quality of sound in compact disks and wireless phones. Researchers are still looking for more efficient codes to make use of redundancy in more clever and useful ways. It is remarkable and surprising that a lot of theoretical mathematics can be used in the design of good codes. Some seemingly useless and abstract parts of mathematics are being used in very practical applications.
Other examples of “redundancy”
We have looked at the use of redundancy mainly in communication systems. But there are apparent redundancies in other places as well. For instance, the so-called “vestigial organs” in humans and other living beings are an interesting topic of controversy. Initially, these organs were thought to be useless and non-functional. However, some functions of these organs have since been discovered. The German anatomist R. Wiedersheim made a list of vestigial organs in 1895 which included approximately 100 organs, including the appendix and coccyx. As science progressed, it was discovered that all of the organs in Wiedersheim’s list in fact had very important functions. For instance, it was discovered that the appendix, which was supposed to be a “vestigial organ,” was in fact a lymphoid organ that fought infections in the body. This fact was made clear in 1997: 2
Other bodily organs and tissues – the thymus, liver, spleen, appendix, bone marrow, and small collections of lymphatic tissue such as the tonsils in the throat and Peyer’s patch in the small intestine – are also part of the lymphatic system. They too help the body fight infection. 3
It was also discovered that the tonsils, which were included in the same list of vestigial organs, had a significant role in protecting the throat against infections, particularly until adolescence. It was found that the coccyx at the lower end of the vertebral column supports the bones around the pelvis and is the convergence point of some small muscles and for this reason, it would not be possible to sit comfortably without a coccyx. 4
Another important example we would like to consider is repetitions in the Qur’an, the Muslim holy book. There are several historical events or divine decrees and commands that are repeated in many places in the Qur’an. Some have criticized this as redundant. However, this is a superficial view. The Qur’an is the word of the All-Wise Creator, who has wisdom in everything He does. So, there must be some wisdom behind these repetitions. Seventh-century Arabs were very skilled in literature and poetry. The literary masters of Arabic admitted and appreciated the miraculous eloquence and literary power of the Qur’an. The Qur’an challenged them to make something similar to it:
And if you are in doubt about what We have revealed to our servant, then produce a sura (chapter) like it. (Baqara 2:23)
They have since been unable to meet the challenge. They used to hold literary competitions where the best poems were chosen and exhibited on the walls of the Ka‘ba, and called the Seven Hanging Poems. The Qur’an demonstrated such eloquence that it caused Labid’s daughter to remove the poems from the walls of the Ka‘ba. She declared while doing so, “Besides the verses of the Qur’an these no longer have any value” [3]. When a Bedouin poet heard verses from the Qur’an, he bowed down in prostration before its eloquence despite the fact that he did not convert to Islam. All of this should make us search for the reasons and wisdom behind the repetitions in the miraculous divine book. Nursi gives a number of such reasons in The Words [3]. He says that since the Qur’an is a book of invocation, prayer and summons, the repetition is desirable, even necessary. Also, it speaks of such mighty matters of extraordinary importance that their repetitions are most appropriate. Two examples of verses that are repeated many times in the Qur’an are “Which of the favors of your Lord will you deny?” (55:13, repeated thirty times in Sura al-Rahman) and “Woe on that day to the deniers” (77:15, repeated ten times in Sura al-Mursalat). These verses proclaim before Earth, the heavens, the ages, and in the face of humanity and jinn, their ingratitude, unbelief, and wrongdoing. They also proclaim their violation of the rights of all creatures, which brings the heavens and Earth to rage, spoils the results of the universe’s creation, and indicates contempt and denial of Divine Sovereignty’s majesty. If these two verses were repeated thousands of times, in a universal teaching related to thousands of issues, a need for them still would remain. It would be conciseness in majesty and miraculousness of eloquence in grace and beauty [3]. For a more detailed account of the reasons behind repetitions in the Qur’an, we refer the reader to Nursi’s The Words [3].
Conclusion
We have seen many examples where redundancy is very useful. We have seen that redundancy is inherently built into the natural languages we speak, and that it serves a purpose. Inspired by this fact, we introduce redundancy explicitly into digital communication systems when we want to be able to correct errors caused by noise. We have seen other examples where what appears to be redundant or unnecessary at first glance really serves a purpose, and hence is not really redundant. We have seen that there are repetitions in the Qur’an but they too serve a purpose. The Qur’an and the universe reflect each other. We see apparent redundancies in both, but in the end we understand that there is a purpose behind everything that may initially appear to be redundant; hence, we cannot really find anything in the universe that is truly redundant. Therefore, we should keep in mind that apparently redundant or useless things in the universe may have hidden treasures behind them. Given that the Creator is All-Wise and has wisdom in everything He does, it is our duty to go beyond the surface and seek that wisdom.
Nuh Aydin is an associate professor of Mathematics at Kenyon College, in Ohio, USA.
Notes
1. http://www.m-w.com
2. http://www.darwinismrefuted.com/embryology_02.html#313.
3. The Merck Manual of Medical Information, Home edition, Merck & Co., Inc. The Merck Publishing Group, Rahway, New Jersey, 1997.
4. http://www.darwinismrefuted.com/embryology_02.html#313.
References
[1] Richard W. Hamming, “Error-detecting and error-correcting codes,” Bell System Technical Journal, 29: 147-160, 1950.
[2] R. Pinch, “Coding theory: the first 50 years,” http://pass.maths.org/issue3/codes/
[3] Bediuzzaman Said Nursi, The Words, Sozler, 1992.
[4] William Z. Shetter, “This essay is redundant,” http://mypage.iu.edu/~shetter/miniatures/redund.htm