Erik Eidt

Assembly Language, Machine Code, Processor Architecture

Bit Twiddling: Understanding Bit Operations

Published Apr 19, 2018Last updated Jun 30, 2019

This post explains bit operations and terms: And, Or, Xor, Complement, Shifting & Masking, Unsigned & Signed data types, sign extension, and more!

Review of using Binary for Numeric Representation
Introduction
Field Extraction
Masking Operations
Shifting Operations
Dynamic Mask Generation
Other Operations

Review of using Binary for Numeric Representation

Binary is a way of repreesenting numbers using strings of binary digits. This is how the computer interally stores things. When we look at a string of bits in binary, we can use them to represent signed or unsigned numbers.

Powers of 2

Binary is based on base 2 rather that our more familiar base 10. Just is in base 10 each digit put on the left of a number represent another position of the power of the base. So, in base 10, 567 means 5x10^2 + 6x10^1 + 7x10^0. In binary 101 means 1x2^2 + 0x2^1 + 1x2^0 or in base 10: 5.

There is a relationship about numbers in binary, bit positions, and powers of 2.

Power	Raised Value(10)	Raised value(2)	Log	# Values
0	2^0 = 1	000001	log2(1) = 0	2
1	2^1 = 2	000010	log2(2) = 1	4
2	2^2 = 4	000100	log2(4) = 2	8
3	2^3 = 8	001000	log2(8) = 3	16
4	2^4 = 16	010000	log2(16) = 4	32
5	2^5 = 32	100000	log2(32) = 5	64

"# Values" means the number of different values that can be represented in "Raised Value" number of bits. If we have one bit, we can store 2 different values; if we have two bits we can store 4 different values.

Signed & Unsigned

Variables hold values. We associate our variables with data types.
The data type tells us the range of possible values, and, also gives the interpretation for all the bit patterns.

The terms signed and unsigned are properties of data types. For example, a variable of type unsigned byte can hold 8 bits of data, ranging from 0 to 255 in value. A variable of type signed byte can hold 8 bits of data, ranging from -128 to 127 in value.

Positive and negative (and zero) are (mutually exclusive) properties of the values of signed data types. Unsigned data types will not admit negative numbers, so they can only be zero or positive.

Inappropriate treatment of variable & storage is a common problem in programming, especially in assembly language. In assembly language, the programmer determines how to interpret data at each and every instruction that accesses data. The hardware generally doesn't remember what data type was last used in what memory location or CPU register.

Accidentally treating some data, for example, as signed when it isn't or vice versa is a very real problem. High-level languages seek to mitigate these problems by using type systems, which detect problems of mismatch either at compile time or at runtime but before the program does something really bad.

In assembly language, however, we don't have these type systems, and sometimes these problems go undetected until values toward the edges of the range of representation are used.

Unsigned Representation

Unsigned representation allows all bits of a number, register, or field to be used for magnitude, and hence, negative numbers cannot be represented. Let's take a look at all possible 3-bit numbers, for example:

3-bit value      in decimal as unsigned
    000              0
    001              1
    010              2
    011              3
    100              4
    101              5
    110              6
    111              7

Signed Representation

All modern computers use what we call "2's complement" for storing signed integers. 2's complement representation uses numbers with the high bit set (or clear) to indicate negative (non-negative) numbers. Here's the same chart for a signed 3-bit value:

3-bit value      in decimal as signed (two's complement)
    100             -4
    101             -3
    110             -2
    111             -1
    000              0
    001              1
    010              2
    011              3

Note the high bit is sometimes called the most significant bit or MSB.

2's complement has the following qualities:

Negation is performed in 2's complement by inverting all the bits, and then adding one.
The range allows for negative numbers that are 1 past the maximum positive, so above: -4 to 3, or for 8-bit bytes: -128 to 127, for 16-bit words: -32768 to 32767.
- That means that negation has a non-zero chance of overflow (and if it happens the value stays the same!).
The same hardware circuitry can be used for addition, subtraction, multiplication, and division regardless of whether the operands are signed or unsigned, which made it heavily favored when circuit capacity was limited (as with initial microprocessors).
- The advent of the computer chip (integrated circuit), while a huge step forward in speed and other aspects actually placed a new (but constantly growing) hard limit on the amount of logic that could be provided on chip, whereas before cost was the main limiting factor in the amount of logic provided as individual transistors & diodes. Reusing circuitry was one way to fit within limits of new hardware technology.
Detection of overflow for these arithmetic operations is different for signed vs. unsigned. For example,
- For unsigned addition, a carry out of the most significant bit indicates overflow.
- For signed addition, if both inputs have the same sign, and the result doesn't, it is overflow. All other conditions either cannot overflow or indicate no overflow.

We can compare with 1's complement, which was used in the 50's and 60's before computer makers settled on 2's complement. 1's complement had the following qualities:

The same range of negative numbers as positive numbers.
Both a positive zero, and negative zero.
Some computers automatically converted negative 0 to positive 0 after each arithmetic operation
Negation is just the complement (no subsequent increment required).
Different hardware is required for signed vs. unsigned for all arithmetic operations

Introduction

Bit manipulation operations operate on strings of binary bits, each bit of which is either 0's or 1's.

However, many bit operations, in particular the ones I'm showing in this article, involve 2 inputs, namely one that is a variable and another that is a constant. Of the two operands, the constant is referred to as a mask, and, performing bit manipulation operations using such a constant mask is referred to as masking. The value of the mask has to do with both the operation we are using and the desired result we want. In the following diagrams, the variable will be represented as a string of letters, one for each bit. This will help you see where the variable's bits wind up after the operation.

For example, with a variable of 3-bits whose value is "abc" — when shifted left by 1 bit — results in (here as a 3-bit result) "bc0": a zero bit is shifted into the low bit position, while the previous bits move left by one position, and the most significant bit is lost.

We'll use an 8-bit byte as an input to show how the various bit operations produce their outputs. Using this notation we will be able to see what is happening to the various bits.

A number of the bit manipulation operators are binary operators, meaning they consume two sources, and produce one result. These operators are: And, Or, Xor, along with a Left-shift, and two Right-shifts. (Some instruction sets and programming languages also include rotate instructions in addition to the shifts.)

Field Extraction

Lets turn to an example. Let's say that we have a sensor that delivers some bit values as readings, and, that these bit values are packed into a single byte memory location. For illustrative purposes, let's say there is a one 1-bit value, one 4-bit value, and one 3-bit value. So, our memory location has a byte that is packed as follows:

76543210     bit position (in decimal)

abcdefgh     bit variables for illustration

where "a" represents the 1-bit field, "bcde" represents the 4-bit field, and "fgh" represents the 3-bit field. For the following examples, we'll concern ourselves with the 4-bit field: "bcde". In order to treat this 4-bit field like any other number, we need to isolate the wanted bits of the byte, and then, right justify it. Our intent is to go from "abcdefgh", a packed byte, to "0000bcde": this field, now unpacked from the other bits, and right justified in the byte. Typically, we'll first read the packed byte into a CPU register and then unpack it there. We are presuming an 8-bit wide CPU register for simplicity; however, other CPU register sizes can be extrapolated from this.

Note that for now, we're treating the 4-bit field as an unsigned field. (In a subsequent section will treat it as a signed field, which means that we're interested in an answer of "bbbbbcde", instead of "0000bcde". A signed field means that the most significant bit represents the sign rather than magnitude as the other bits do.)

Unsigned Field Extraction

There are at least 3 techniques to do this unpacking, so we'll describe those. The first uses a combination of shifts: (1) a Left-shift followed by a Right-shift. The other two use a combination of (2) Right-Shift followed by a masking And operation, or (3) vice versa.

Approach (1u) — Left-shift, then Right-shift

First, we left-justify the field in our CPU register. In our case, this means shifting it left by 1 bit, or register high bit number - field start position = 7 - 6 = 1.

76543210     bit position (in decimal)

abcdefgh     our input byte, shown in binary as variables
       1     shift amount in decimal
-------- <<  logical left shift operation
bcdefgh0     result

The field we are interested in is now left-justified in our 8-bit CPU register. This left justification has effectively removed any bits in higher positions than we are interested in. Next, we shift right to right-justify the field, which requires a shift right of 4 bits, which comes from the field's length: register size - field length = 8 - 4 = 4.

bcdefgh0     intermediate output/input
       4     shift amount in decimal
-------- >>  logical right shift operation
0000bcde     result

Because we want the item right justified when we're done, we do the right shifting last.

Approach (2u) — Right-shift, then masking And

First, the right shift:

abcdefgh     our input byte, shown in binary as variables
       3     shift amount in decimal
-------- >>  logical right shift operation
000abcde     result

Next, the masking And. The constant mask uses 1's where we want to keep bits from other input and 0's where we want to ignore bits from the other input. In this case our mask value is 15 in decimal, but the binary rendition is more illuminating.

000abcde     intermediate output/input
00001111     constant mask for And operation
-------- &   And operation
0000bcde     result

Approach (3u) — Masking And, then Right-shift

First, the masking And. The constant mask uses 1's where we want to keep bits from other input and 0's where we want to ignore bits from the other input. In this case our mask value is 120 in decimal, but the binary rendition is more illuminating.

abcdefgh     our input byte, shown in binary as variables
01111000     constant mask for And operation
-------- &   And operation
0bcde000     result

Next, the right shift:

0bcde000     intermediate output/input
       3     shift amount in decimal
-------- >>  logical right shift operation
0000bcde     result

To compare these sequences we have to understand a bit about the processor we're using; however all of these sequences are two instructions on most processors, so roughly comparable.

Some (older) processors shift more slowly the larger the shift count, which would mean that shift-heavy solutions would be slower; however, that isn't usually the case these days. Certain sequences use shorter immediate (constant) values than others, and this can result in shorter instructions, which is a space and sometimes speed advantage. For example, Approach (3) uses a constant of 120, for the mask, whereas Approach (4) uses a constant of 15 for the mask.

Signed Field Extraction

For signed field extraction, we need to replicate the field's sign bit across the rest of the CPU register. This means that where we had 0's in the unsigned field extraction, "0000bcde", we need the sign bit replicated instead: "bbbbbcde". In some sense this places the field's sign bit into the CPU register's sign bit, though for correctness, this bit is not just placed there but replicated to all the rest of the high bits in the register.

The approach here is that we will first left-justify the value in the CPU register using a Left-shift, and then, right justify the value as in (1u) — but this time using an Arithmetic Right-shift, which keeps the sign bit as is, instead of shifting zero bits into the leftmost bit (as with Logical Right-shift). We need to position the field's sign bit in the registers sign bit, using the left shift, in order to productively use the Arithmetic Right-shift operation for sign extension. Essentially, we left justify the field in the CPU register, then right justify it using arithmetic shifting.

Approach (1s) — Left-shift, then Right-shift

First, we left-justify the field in our CPU register. In our case, this means shifting it left by 1 bit, or register high bit number - field start position = 7 - 6 = 1.

76543210     bit position (in decimal)

abcdefgh     our input byte, shown in binary as variables
       1     shift amount in decimal
-------- <<  logical left shift operation
bcdefgh0     result

The field we are interested in is now left-justified in our 8-bit CPU register. Next, we shift right to right-justify the field, which requires a shift right of 4 bits, which comes from the field's length: register size - field length = 8 - 4 = 4.

bcdefgh0     intermediate output/input
       4     shift amount in decimal
-------- >>  arithmetic right shift operation
bbbbbcde     result

Masking Operations

Next we'll cover each bit manipulation operation. Note that we are using constant masks, and that the mask chosen is arbitrary. Masks can have consecutive 1's (as in 00111100), or mixed 1's and 0's: as in 01011010.

And Operation

The And operation using a constant mask: the result passes through the variable input to the result where the constant mask has a 1, and clears the result where the mask has a 0.

abcdefgh     our input byte, shown in binary as variables
01011010     constant mask
-------- &   And operation
0b0de0g0     result

This operation is useful for extracting information of interest, while ignoring information that is not of interest.

Or Operation

The Or operation using a constant mask: the result passes through the variable input to the result where the constant mask has a 0, and sets the result where the mask has a 1.

abcdefgh     our input byte, shown in binary as variables
01011010     constant mask
-------- |   Or operation
a1c11f1h     result

This operation is useful for setting (to 1 or true) some bits while leaving other, neighboring bits unaffected.

Xor Operation

The Xor operation using a constant mask: the result passes through the variable input to the result where the constant mask has a 0, and complements the result where the mask has a 1. Here we're using the notation that A is the complement of a. that is, A = ~a, B = ~b, and so forth.

abcdefgh     our input byte, shown in binary as variables
01011010     constant mask
-------- ^   Xor operation
aBcDEfGh     result

This operation is useful for toggling some bits while leaving other, neighboring bits unaffected.

Complement Operation

The Complement operation reverses every bit. It is like using Xor with a constant mask of all 1's

abcdefgh     our input byte, shown in binary as variables
-------- ~   Complement operation
ABCDEFGH     result

This operation is useful for inverting a mask or other value, and also used in negation.

Negation Operation

Most processors have an operation to perform negation on signed numbers, however, the negation construct for 2'c complement representation of signed numbers, is to apply the complement and then add one (increment).

Shifting Operations

In these examples, we shift by only one bit. Though modern hardware can shift multiple bit positions in a single cycle, a shift result for a higher shift count can still be thought of as multiple consecutive 1-bit shifts.

Note that some hardware has anomalies when using a shift count that is equal to or larger than the CPU register size, or when using a variable shift amount that turns out to be zero. (I view these as unfortunate inconsistencies, shifting by zero, for example, should simply return the input value rather than something unexpected).

Left Shift

The Left-shift operation shifts the input by the shift count, and inserts zeros into vacated bits on the right.

abcdefgh     our input byte, shown in binary as variables
       1     shift count in decimal
-------- <<  Left shift operation
bcdefgh0     result

This shift operation is also the same as multiplication by a power of 2, where the power is the shift count. This is a more efficient way to multiple by such values, if you know in advance you're dealing with a power of 2.

Logical Right-shift

The logical Right-shift operation shifts the input by the shift count, and inserts zeros into vacated bits on the left.

abcdefgh     our input byte, shown in binary as variables
       1     shift count in decimal
-------- >>  Logical Left shift operation
0abcdefg     result

This also divides by a power of 2, where the power of 2 is the shift count.

Arithmetic Right-shift

The arithmetic Right-shift operation shifts the input by the shift count, and inserts copies of the sign bit into vacated bits on the left.

abcdefgh     our input byte, shown in binary as variables
       1     shift count in decimal
-------- >>  Logical Left shift operation
aabcdefg     result

Of particular note here is that the sign bit is kept and/or replicated depending on how we look at it. Larger shift counts will result in a string of aaa's starting with the high bit. This is the result we want in order to properly sign extend a smaller value to the size of a CPU register, for example.

This shift operation is similar to a divide by a power of 2 but has an issue with an input value of -1 as we would expect the result of -1/2 to be 0 when truncated to integer, but using arithmetic shifting it simply stays as -1 instead of going to 0.

Dynamic Mask Generation

1 << n

With 0 <= n <= register size (e.g. 32), this construct computes a value with a single bit set. Here's a table of 1<<n for n:

  1<<n        n        1<<n (decimal)
 00000001     0           1
 00000010     1           2
 00000100     2           4
 00001000     3           8
 00010000     4          16
 00100000     5          32
 01000000     6          64 
 10000000     7         128

Notice that the 1's march across the word.

(1 << n) - 1

This construct generates a set of consecutive 1's.

 (1<<n)-1     n    (1<<n)-1 (decimal)
 00000000     0           0
 00000001     1           1
 00000011     2           3
 00000111     3           7
 00001111     4          15
 00011111     5          31
 00111111     6          63 
 01111111     7         127

((1 << n) - 1) << m

Let's say that m is one. This generates a bit mask of consecutive ones at a position other than right justified.

((1<<n)-1)<<1 n
 00000000     0
 00000010     1
 00000110     2
 00001110     3
 00011110     4
 00111110     5
 01111110     6
 11111110     7

The complement of each of these can form a useful mask:

~ (1 << n)

~(1<<n)      n      
11111110     0
11111101     1
11111011     2
11110111     3
11101111     4
11011111     5
10111111     6
01111111     7

~((1 << n) - 1)

 ~((1<<n)-1)     n 
 11111111        0
 11111110        1
 11111100        2
 11111000        3
 11110000        4
 11100000        5
 11000000        6
 10000000        7

~(((1 << n) - 1) << m)

~(((1<<n)-1)<<1) n
 11111111        0
 11111101        1
 11111001        2
 11110001        3
 11100001        4
 11000001        5
 10000001        6
 00000001        7

Using dynamic mask notation for constant mask creation

In C and other languages, you can use the above dynamic mask generation with constant values for n. That can make it easy to generate masks of interest. For example, a 4-bit mask (e.g. for the low nibble) can be generated using the constant (1 << 4)-1. A mask for the upper nibble of a byte is (the former) << 4.

Other Operations

Some CPUs have direct signed and/or unsigned field extraction instructions, which can replace the above sequences.

Many CPUs also have sign extension and zero extension operations for 8-bit and 16-bit inputs to convert them to 32-bit values.

Older CPUs may have only 1-bit shift operations. In such cases, shift operations need to be used in sequence to obtain the desired shift. If a processor is absent a variable shift amount, a loop of 1-bit shifting can be used.

See Also https://graphics.stanford.edu/~seander/bithacks.html

This post first appear on Erik Eidt's "Discussions on Programming Models" at https://erikeidt.blogspot.com/2018/04/bit-twiddling-understanding-bit.html

Bit manipulation Masking Binary

Report

Enjoy this post? Give Erik Eidt a like if it's helpful.

Erik Eidt

Assembly Language, Machine Code, Processor Architecture

I am an assembly language expert, having mentored recently in MIPS, LC-3&4, MARIE, PEP/8&9, others. I will explain the concepts and fundamentals that you need to know. For example: how the processor works; how assembly languag...

Discover and read more posts from Erik Eidt

get started