Information Theory for Equalizing the Residue-wise Information Amounts of the Proteins and Protein-Coding DNA
1  Biophysics Department of Altinbas University Medical Faculty


Genes are regions of deoxyribonucleic acid (DNA), the hereditary material of biological cells, that encode the synthesis of proteins. Both proteins and DNA are polymeric macromolecules made up of unique combinations of their respective building blocks. Each building block is a residue, and the residue-wise information content of the DNA encoding a protein of a certain length is assumed to be inherently related to that of the protein. The aim here is to present a method, based on information (communication) theory, that relates the residue-wise information contents of DNA and protein molecules. In information theory, the information content of a polymeric macromolecule can be calculated in bits by multiplying the number of building blocks spanning the entire length of the macromolecule by the Shannon entropy of each building block. The Shannon entropy of each building block is determined by the variety of those units, that is, the number of distinct types of building block, assuming that every type is equally likely to occur at every position along the macromolecule. When this approach is applied to a protein of a given size and to the DNA encoding a protein of the same length, the residue-based information amount of the protein appears to be much lower than that of the DNA. This discrepancy can be eliminated by introducing a new parameter into the calculation and/or by reformulating the calculation of the information amount of proteins.
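The calculation described above can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not state the numerical values used. The sketch assumes the standard alphabet sizes (4 nucleotide types, 20 amino acid types), three nucleotides per encoded amino acid, and no stop codon; the function and constant names are hypothetical.

```python
import math

# Assumed values (standard biology, not given in the abstract itself).
N_NUCLEOTIDE_TYPES = 4      # A, C, G, T
N_AMINO_ACID_TYPES = 20     # standard proteinogenic amino acids
NUCLEOTIDES_PER_CODON = 3   # one codon encodes one amino acid


def uniform_entropy_bits(n_types: int) -> float:
    """Shannon entropy in bits of one position when all types are equally likely."""
    return math.log2(n_types)


def information_content_bits(n_units: int, n_types: int) -> float:
    """Number of building blocks multiplied by the per-unit Shannon entropy."""
    return n_units * uniform_entropy_bits(n_types)


def compare(protein_length: int) -> dict:
    """Compare a protein with the coding DNA of the same encoded length.

    The stop codon is ignored here for simplicity (an assumption).
    """
    protein_bits = information_content_bits(protein_length, N_AMINO_ACID_TYPES)
    dna_bits = information_content_bits(
        protein_length * NUCLEOTIDES_PER_CODON, N_NUCLEOTIDE_TYPES
    )
    return {
        "protein_bits": protein_bits,
        "dna_bits": dna_bits,
        # Information amount per encoded amino acid residue.
        "protein_bits_per_residue": protein_bits / protein_length,
        "dna_bits_per_residue": dna_bits / protein_length,
    }


if __name__ == "__main__":
    for key, value in compare(100).items():
        print(f"{key}: {value:.3f}")
```

Under these assumptions each encoded residue corresponds to 6 bits in the DNA (three nucleotides at 2 bits each) but only about 4.32 bits in the protein, which is the kind of residue-wise gap the abstract proposes to eliminate.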

Keywords: DNA, protein, Shannon’s entropy, information amount