Bit Manipulation

Bit manipulation is one of the most elegant and concise areas of algorithms. It builds directly on how the computer stores information. Because it works on the lowest-level representation of data, it usually needs very little extra space, and the operations themselves are typically very fast.

The basic unit is the bit, which has only two possible values, 0 or 1 (which is exactly why the base is 2, or binary). On most computer systems nowadays, the integer data type (int) has 32 bits.

A byte is different from a bit. Each byte has 8 bits (so a 32-bit integer is 4 bytes in size). Thus, the value range of an unsigned byte is [0, 2^8 = 256) (if you have seen extended ASCII codes before, have you ever asked why the range is [0, 256)?). This connects to some very common units used daily:

 1 KB is (2^10 = 1024 ~ 10^3) bytes

 1 MB is (2^10 = 1024 ~ 10^3) KB

 1 GB is (2^10 = 1024 ~ 10^3) MB

 1 TB is (2^10 = 1024 ~ 10^3) GB

If we have an integer array with 10^6 elements, its total size is 4*10^6 bytes ~ 4 MB. If you have a movie that is about 100 GB in size, the number of bits in it can be estimated the same way (homework...).
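To make the estimate concrete, here is a minimal C++ sketch. The helper names bytesForInts and bytesToBits are made up for illustration, and the sketch assumes 4-byte (32-bit) integers and 8-bit bytes, as described above.

```cpp
#include <cstdint>

// Bytes needed to store `count` 32-bit integers (assumed 4 bytes each).
constexpr std::uint64_t bytesForInts(std::uint64_t count) {
    return count * sizeof(std::int32_t);
}

// Convert a byte count to a bit count (8 bits per byte).
constexpr std::uint64_t bytesToBits(std::uint64_t bytes) {
    return bytes * 8;
}
```

For example, bytesForInts(1000000) gives 4000000 bytes, which is about 4 MB.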

Now let us move on to negative numbers, which brings us to signed numbers. For a signed number, the leftmost (most significant) bit is the sign bit. When it is 0, the number is non-negative; when it is 1, the number is negative. Thus, for a signed 32-bit integer, the range is [-2^31, 2^31), or

INT_MIN = -2^31

INT_MAX = 2^31 - 1
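These two limits can be checked at compile time. The following is only a small sketch: the constant names kTwoTo31, kIntMin, and kIntMax are made up for illustration, and it uses the fixed-width type std::int32_t so that the 32-bit assumption is explicit.

```cpp
#include <cstdint>
#include <limits>

// 2^31 does not fit in a 32-bit signed integer, so compute it in 64 bits.
constexpr std::int64_t kTwoTo31 = std::int64_t{1} << 31;

constexpr std::int32_t kIntMin = std::numeric_limits<std::int32_t>::min();
constexpr std::int32_t kIntMax = std::numeric_limits<std::int32_t>::max();

static_assert(kIntMin == -kTwoTo31, "INT_MIN is -2^31");
static_assert(kIntMax == kTwoTo31 - 1, "INT_MAX is 2^31 - 1");
```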

If you look at the bits of INT_MAX, they are 0111...111, or one 0 followed by thirty-one 1's.

Then how about INT_MIN?

If this is the first time you have thought about this question, you might guess that it is thirty-two 1's. But actually it is not. It is 1000...000, or one 1 followed by thirty-one 0's. I know you may ask:

1. Why?

2. If this is true, then what is the number whose bits are all 1?

The first question is a very good one, but I will not try to answer it here, since it involves design decisions that go back to the early days of computing. There are good resources online about this (look up "two's complement").

For the second question, the answer is -1. (What?) Such a fun world, isn't it?
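If you want to see these bit patterns for yourself, here is a minimal sketch. The helper name bits32 is made up for illustration; it relies on std::bitset from the standard library and on std::int32_t being a two's complement type.

```cpp
#include <bitset>
#include <cstdint>
#include <string>

// Return the 32 bits of `value` as text, most significant bit first.
// std::int32_t is a two's complement type, so the cast to std::uint32_t
// keeps the bit pattern unchanged.
std::string bits32(std::int32_t value) {
    return std::bitset<32>(static_cast<std::uint32_t>(value)).to_string();
}
```

Calling bits32 on INT_MAX gives one 0 followed by thirty-one 1's, on INT_MIN it gives one 1 followed by thirty-one 0's, and on -1 it gives thirty-two 1's.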

The next topic is bitwise operations. The common ones are: & (and), | (or), ^ (xor), ~ (NOT, which flips every bit), << (left shift), and >> (right shift).
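As a quick illustration, the operators can be combined into a few small helpers. This is only a sketch: the function names are made up for illustration, bit 0 is the least significant bit, and the index i is assumed to be less than 32.

```cpp
#include <cstdint>

// True if bit `i` of x is 1: shift it down with >> and mask with &.
constexpr bool testBit(std::uint32_t x, unsigned i) {
    return ((x >> i) & 1u) != 0;
}

// Turn bit `i` on: build a one-bit mask with << and combine with |.
constexpr std::uint32_t setBit(std::uint32_t x, unsigned i) {
    return x | (1u << i);
}

// Turn bit `i` off: invert the mask with ~ and combine with &.
constexpr std::uint32_t clearBit(std::uint32_t x, unsigned i) {
    return x & ~(1u << i);
}

// Flip bit `i`: xor with the mask.
constexpr std::uint32_t toggleBit(std::uint32_t x, unsigned i) {
    return x ^ (1u << i);
}
```

For example, with x = 10 (binary 1010), setBit(x, 0) gives 11 (binary 1011) and clearBit(x, 1) gives 8 (binary 1000).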

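Besides the definitions, pay attention to the precedence of these operators. In C++, for example, the comparison operators == and != bind more tightly than &, ^, and |, and + and - bind more tightly than the shifts. Getting this wrong can cause hidden bugs in code.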
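Here is a small sketch of two common precedence traps in C++. The function names are made up for illustration, and the "as parsed" versions spell out with explicit parentheses how the compiler reads the unparenthesized expressions.

```cpp
// How the compiler reads `x & 1 == 0`: == binds more tightly than &,
// so the comparison happens first. 1 == 0 is false (0), and x & 0 is
// always 0, so this "even check" returns false for every x.
bool isEvenAsParsed(int x) { return (x & (1 == 0)) != 0; }

// What was intended: mask the lowest bit first, then compare.
bool isEven(int x) { return (x & 1) == 0; }

// How the compiler reads `1 << n + 1`: + binds more tightly than <<.
int shiftAsParsed(int n) { return 1 << (n + 1); }

// What may have been intended: shift first, then add 1.
int shiftThenAdd(int n) { return (1 << n) + 1; }
```

When in doubt, adding parentheses around every bitwise sub-expression costs nothing and makes the intent obvious.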

Finally we can go to the applications in coding! Yeahhh


