Expected value of the amount of information delivered by a message
Information entropy is defined as the average amount of information produced by a stochastic source of data.
The measure of information associated with each possible data value is the negative logarithm of the probability mass function for that value. Thus, when the data source produces a low-probability value (i.e., when a low-probability event occurs), the event carries more "information" ("surprisal") than when the source produces a high-probability value. The amount of information conveyed by each event, defined in this way, becomes a random variable whose expected value is the information entropy. Generally, entropy refers to disorder or uncertainty, and the definition of entropy used in information theory is directly analogous to the definition used in statistical thermodynamics. The concept of information entropy was introduced by Claude Shannon in his 1948 paper "A Mathematical Theory of Communication".
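In symbols, for a discrete random variable $X$ taking values $x_1, \ldots, x_n$ with probability mass function $P$, the information content of the outcome $x_i$ is $-\log_b P(x_i)$, and the entropy is its expected value:

$$ \mathrm{H}(X) = \mathbb{E}\left[-\log_b P(X)\right] = -\sum_{i=1}^{n} P(x_i)\,\log_b P(x_i), $$

where $b$ is the base of the logarithm, and a term with $P(x_i) = 0$ is taken to contribute $0$ by the convention $0 \log 0 = 0$.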
The basic model of a data communication system is composed of three elements: a source of data, a communication channel, and a receiver. As expressed by Shannon, the "fundamental problem of communication" is for the receiver to be able to identify what data was generated by the source, based on the signal it receives through the channel. The entropy provides an absolute limit on the shortest possible average length of a lossless compression encoding of the data produced by a source, and if the entropy of the source is less than the channel capacity of the communication channel, the data generated by the source can be reliably communicated to the receiver (at least in theory, possibly neglecting some practical considerations such as the complexity of the system needed to convey the data and the amount of time it may take for the data to be conveyed).
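As an illustrative sketch of the compression bound (the source alphabet, probabilities, and prefix code below are hypothetical choices for this example, not part of Shannon's presentation), the following Python snippet computes the entropy of a small source and compares it with the average length of a prefix code matched to that source:

```python
from math import log2

# Hypothetical source alphabet with dyadic probabilities (illustration only).
probabilities = {"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125}

# Entropy in bits per symbol: H = -sum p * log2(p)
entropy = -sum(p * log2(p) for p in probabilities.values())

# A prefix code matched to these probabilities (e.g. A=0, B=10, C=110, D=111).
code_lengths = {"A": 1, "B": 2, "C": 3, "D": 3}
avg_code_length = sum(probabilities[s] * code_lengths[s] for s in probabilities)

print(f"Entropy:             {entropy:.3f} bits/symbol")          # 1.750
print(f"Average code length: {avg_code_length:.3f} bits/symbol")  # 1.750
```

Because the probabilities in this example are powers of two, the average code length meets the entropy bound exactly; for general distributions the bound can only be approached, never beaten, by a lossless code.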
Information entropy is typically measured in bits (alternatively called "shannons"), or sometimes in "natural units" (nats) or decimal digits (called "dits", "bans", or "hartleys"). The unit of measurement depends on the base of the logarithm used to define the entropy.
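For example (a minimal sketch using Python's standard math module), the entropy of a fair coin flip comes out as 1 bit, ln 2 ≈ 0.693 nats, or log10 2 ≈ 0.301 hartleys, depending only on the base chosen for the logarithm:

```python
import math

p = [0.5, 0.5]  # fair coin flip

def entropy(probs, base):
    """Shannon entropy of a discrete distribution in the given logarithm base."""
    return -sum(q * math.log(q, base) for q in probs if q > 0)

print(entropy(p, 2))        # 1.0     bit (shannon)
print(entropy(p, math.e))   # ~0.6931 nat
print(entropy(p, 10))       # ~0.3010 hartley (ban, dit)
```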
Shannon's theorem also implies that no lossless compression scheme can shorten all messages. If some messages come out shorter, at least one must come out longer due to the pigeonhole principle. In practical use, this is generally not a problem, because we are usually only interested in compressing certain types of messages, for example English documents as opposed to gibberish text, or digital photographs rather than noise, and it is unimportant if a compression algorithm makes some unlikely or uninteresting sequences larger. However, the problem can still arise even in everyday use when applying a compression algorithm to already compressed data: for example, making a ZIP file of music, pictures or videos that are already in a compressed format such as FLAC, MP3, WebM, AAC, PNG or JPEG will generally result in a ZIP file that is slightly larger than the source file(s).
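As a rough illustration of this effect (a sketch using Python's standard os and zlib modules, with arbitrary data sizes chosen for the example), feeding a general-purpose compressor data that is already nearly incompressible, here random bytes standing in for an already-compressed file, produces output slightly larger than the input, whereas highly redundant data shrinks substantially:

```python
import os
import zlib

# Random bytes behave like already-compressed data: close to maximal entropy.
random_like = os.urandom(100_000)

# Highly redundant data: low entropy per byte.
redundant = b"abc" * 40_000  # 120,000 bytes

for label, data in [("random-like", random_like), ("redundant", redundant)]:
    out = zlib.compress(data, level=9)
    print(f"{label}: {len(data)} bytes -> {len(out)} bytes")
```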