# Bits, Bytes, and Binary

(Last Mod: 27 November 2010 21:37:41 )

## Objectives

• Understand what a bit is.
• Understand what a byte is.
• Understand what bitwise operations are.
• Understand how unsigned integers are encoded in binary.

## Bits

In the base-10 numbering system nearly everyone is familiar with, the word "digit" refers to one of the ten symbols used to encode values into a written representation. The number 3784, for instance, consists of four digits. Each one of the digits can be any of the ten symbols, 0 through 9, that we use in this system. We refer to this set of ten digits as "decimal digits" because as we go from one to the next, say a 4 to a 5, the value changes by one-tenth (hence "deci-") of the number base.

If we were to use a different number base, the word "digit" would still have the same meaning. In the case of base-2, we would have just two symbols, 0 and 1, to choose from for each digit. We refer to this as the set of "binary digits". Instead of having to constantly use the phrase "binary digit", it has been shortened to simply "bit" for convenience.

A bit always refers to a single digit that can take on exactly one of two possible states. Depending on the context, we might refer to these states using a number of different labels. Some of the more common ones are:

{0, 1}, {OFF, ON}, (LO, HI), {FALSE, TRUE}, {CLEARED, SET}

Generally the first label in each group is stored the same way as the first label in any other group. The same goes for the second label in each group. In other words, if we store a LO as 0V on a capacitor and HI as 5V on a capacitor, then 5V would also be interpreted as being a 1, ON, TRUE, and SET, depending on which pair of labels made the most sense in a given context. This is NOT engraved in stone and playing games with the polarity of different signals is one of the common ways that digital designers use to obtain higher performance from a particular system, although usually at the expense of clarity.

## Digital Systems

A bit can be used to encode whether a certain piece of information is in one of two possible states. It might tell us whether a light is on or off. It might tell us whether the water level in a tank is rising or not. It might tell us whether the pressure in a line is within safe limits or not. While useful, a single bit is almost never sufficient to provide all of the information that we need. Therefore we use many bits in combination to build up what is known as a "digital system". A digital system is simply a system where the signals we are interested in are encoded into discrete digits - usually binary digits or bits. The counterpart to a digital system is an analog system where the signals are encoded by some parameter, such as the voltage on a wire, that can take one any value within a continuous range.

Modern digital systems, of which a computer is just one example, are nothing more than machines that combine many, many bits of information along with some mechanism of "reading" what the state of each of those bits presently is and of "writing" to each of those bits so as to force it to take on one of the possible values that it can represent.

## Bytes

So far we have dealt only with individual bits. Although we could build up a digital system strictly working with individual bits - and many systems, even some complex systems, are constructed just this way - we often find it much easier to work with collections of several bits as a single piece of data. The usual motivation for doing so is that we want to represent information that cannot be encoded adequately using just two states. By using multiple bits, the number of states that can be encoded grows exponentially. To be more specific, if we have N bits, then we can encode 2N different states.

The term "byte" was coined by Dr. Werner Buchholz from IBM in 1956 at a time when the need to standardize computer character sets was becoming widely recognized. He had intended it to refer to a data group consisting of exactly eight bits. He apparently used the common everyday meaning of "bit" as meaning a "small bite" and replaced the 'i' with a 'y' to prevent "bite" from being mistaken too easily for "bit". While it certainly caught on, "byte" quickly became the generic term used for the "atomic" data size of a processor - atomic meaning the smallest addressable amount of memory. Many systems therefore existed that have bytes that were anywhere from six to nine bits wide. Today, almost all machines use an eight-bit byte and, unless indicated otherwise, all of the material in this course is to be interpreted as referring to eight bit bytes.

With eight bits, a byte can take on 28, or 256, distinct states. Although it is completely up to us to determine what each one of those unique states represents, there are a couple of highly standardized mappings - integers and characters.

## Bitwise Operations

In most digital systems, computers in particular, we can not directly access information at the bit level - the hardware itself is not capable of it. Instead, in order to work with the individual bits within a larger packet of bits (a byte or larger) we work with the value as a whole and employ techniques that permit us to have the effect of reading and writing individual bits without the results being affected by the states of the other bits and, just as importantly, without affecting the states of those other bits. These are called "bitwise operations", also known more affectionately as "bit banging".

## Pure Binary Integers

Since computers are used extensively for processing numeric data, it is only reasonable that one of the mappings for a collection of bits should represent numeric values. As will be explored in later lessons, there are many different ways to map numeric values onto a set of bits. For now, we will limit the discussion to non-negative integers using a mapping known as "pure binary". This is also often referred to as "straight binary" or "unsigned binary".

Sticking to our 8-bit byte, with 256 discrete values available to us, we will use them to represent the integers 0 through 255. The mapping we will use will follow the same rules as the base-10 numbering system we use everyday. As with the other ways to represent numeric values in binary, we will explore the rules and properties of positional numbering systems in much greater depth in later lessons.

 Bit # 7 6 5 4 3 2 1 0 Label b7 b6 b5 b4 b3 b2 b1 b0 Weighting 27 26 25 24 23 22 21 20 Decimal 128 64 32 16 8 4 2 1

Notice that we start the labeling with 0 instead of 1. This is almost universal in computer science and, for most applications, is quite convenient. Notice that the bit number, the bit label, and the exponent to which the base (2) is raised are all the same because of this convention.

### Example: What is the integer represented by the bit pattern 10110101?

 Bit # 7 6 5 4 3 2 1 0 Label b7 b6 b5 b4 b3 b2 b1 b0 Weighting 27 26 25 24 23 22 21 20 Decimal 128 64 32 16 8 4 2 1 10110101 = 1 0 1 1 0 1 0 1 Value 128 0 32 16 0 4 0 1

The value of each bit it obtained by multiplying the bit by the weighting for that bit, as shown in the above table in the "Value" row. The value for the byte as a whole is then simply the sum of the values for all the bits. In this case it turns out to be:

Value = 1011 0101 b = (1 x 128) + (0 x 64) + (1 x 32) + (1 x 16) + (0 x 8) + (1 x 4) + (0 x 2) + (1 x 1) = 181

Notice that we separated the binary representation into groups of four bits. This is for readability, just as we frequently separate large decimal values into groups of three digits separated by commas. In order to prevent confusion, we also generally indicate when a number is not expressed in decimal form unless the context makes if very clear that this is the case. For binary representations, we commonly follow the value with a 'b'. If subscripted numbers are available to us, then we might also us a subscripted '2' after the value.

### Example: What is the bit pattern that represents the integer 217?

Fast and efficient ways of performing number base conversions will be a topic of a later lesson. For now, we will use the brute force method as it is adequate for our present needs. We will simply start with the largest weighting and, if it is less that the value we are trying to represent, we will set that bit and subtract that weighting from our value. If it is larger than our value, we clear that bit and do nothing to our value. We then proceed to the next bit using the remaining part of our original value and follow this same procedure until we get to the smallest weighting. Using this, we find that the number from our example would be represented as:

 Bit # 7 6 5 4 3 2 1 0 Label b7 b6 b5 b4 b3 b2 b1 b0 Weighting 27 26 25 24 23 22 21 20 Decimal 128 64 32 16 8 4 2 1 217 = 128 64 0 16 8 0 0 1 Result 1 1 0 1 1 0 0 1

Value = 217 = 1101 1001 b