Bit Order, Byte Order, and other B.S.

Sat, 22 Feb 2026


Introduction

I often work with values packed into a small number of bytes using fixed-point encodings. Generally the goal is to pack as much information as possible into just a few bytes, either on embedded devices that use protocols like SPI or CAN bus, or when doing over-the-air communication through radios. Whatever the case may be, a common issue is that the low-level details of how multiple values are packed into the same bytes are annoyingly under-specified. When digging into these details, I've had multiple disagreements about which encoding makes the most sense.

Part of the problem is that there are many competing terms for the different ways of doing these encodings. For example, many articles refer to "left-most" or "right-most" bits. These terms only make sense in context with a correct accompanying visualization of the encoded data, making them ill-suited to general discussions of this topic.

Another issue is that many standards that are supposed to describe these types of encodings fail to fully specify the information you need to decode the values. For example, when working with CAN bus, a common file format is the .dbc file, which specifies how to decode fixed-point values from a CAN message with a specific 11-bit ID and up to 8 bytes of payload. DBCs specify "signals": fixed-point encoded values within the payload bytes. For the purposes of this article, only 3 fields are relevant, highlighted below:

  SG_ EngineSpeed : 24|16@1+ (0.125,0) [0|8031.875] "rpm" Vector__XXX
                    │  │  │ 
         start bit──┘  │  │
                       │  └───endianness (@1 == Little-Endian)
        bit length─────┘

So DBCs specify the start bit, bit length, and endianness. But this is not enough to decode values! There is no information about how start bits not divisible by 8 should be handled, which is a very common case in these files. Tools that consume .dbc files have you specify this either as "OSEK", which as far as I can tell always means LSB0, or as "Follows Endianness", which is MSB0 for Big-Endian and LSB0 for Little-Endian. Some tools have you specify these properties on a per-signal basis; others assume they are global to the entire file. If it's the latter when you needed the former, you will have to invert start bits manually to make the decoding make sense, which is tedious and easy to mess up.

When trying to choose a standard encoding for a codebase, I've had disagreements about what endianness made sense. Like in the case of .dbc files, this was because critical information was not being communicated: how to handle bit numbering within bytes. As someone who preferred little-endian, I had assumed any encoding would use LSB0. Similarly, my coworker who was advocating for big-endian assumed any encoding would use MSB0. We each thought the other was crazy until we got out the whiteboard and really dug into the details. In this article, I'll explain what I believe are the only 2 properties of these encodings that really matter, and dig into the different properties that the 4 resulting encoding options have.


What Really Matters

To cut through all the confusing conventions, terminology, and diagrams, it's important to focus on the numeric value of the number being encoded; 0x123 has a specific value that will be preserved by the compiler regardless of machine endianness or human-level labels like "left-most" or "right-most". There are only two choices that matter, and here is my best explanation of each:

  1. Byte Order: if the numeric value of a byte increases with its pointer address, the byte order is Little-Endian (LE); if it decreases, then it's Big-Endian (BE).
  2. Bit Order: within a byte, if the bit number increases with the numeric value, then bit 0 is the Least Significant Bit (LSB0); if it decreases, then bit 0 is the Most Significant Bit (MSB0).

Taking every combination of these two gives us a table of four possibilities, with their properties summarized below:

              │ Little-Endian            │ Big-Endian
  ────────────┼──────────────────────────┼──────────────────────────
  LSB0        │ monotonic bit-numbering, │ numerically consistent
              │ numerically consistent   │
  ────────────┼──────────────────────────┼──────────────────────────
  MSB0        │ ???                      │ monotonic bit-numbering,
              │                          │ readable hex-dumps


Monotonic Bit-Numbering

For the (LE, LSB0) and (BE, MSB0) encodings, the bit number changes in the same direction with the numeric value for both the bytes and bits. For both these variants, if you plot the bit number against the byte index + bit value within a byte, you get a straight line either increasing or decreasing respectively. For the other encodings, you instead get a zig-zag pattern that is trickier to think about when encoding data across byte boundaries:

  [Plot: bit number vs. 8*(byte index) + bit value for each encoding.
   (LE, LSB0) is a straight increasing line and (BE, MSB0) a straight
   decreasing line; (LE, MSB0) and (BE, LSB0) zig-zag at each byte boundary.]

Numerical Consistency

The encodings that use LSB0 are numerically consistent, in that bit numbers align with numeric bit values: bit n within a byte has numeric weight 2^n. When the encoding's endianness also matches the value's byte order, as with (LE, LSB0), this extends across bytes: serializing 0x100 with a start bit of 0 will result in only bit 8 being set.

In contrast, MSB0 encodings will use numerically non-contiguous bits when encoding across byte boundaries. It's hard to say this is a disadvantage in practice; in the example code for reading bits it adds a very small amount of complexity, which doesn't really matter once you implement that function in your codebase and hopefully never have to think about it again.

Readable Hex Dumps

The (BE, MSB0) approach has an advantage that the others don't, which is that if you do a hex dump of the encoded data, values that have been encoded will be visible in the same form as their associated numeric value:

read start=4, len=16:
  hex dump: EF CD AB 
(BE, MSB0): 0xFCDA
(BE, LSB0): 0xECDB
(LE, LSB0): 0xBCDE
(LE, MSB0): 0xACDF

This property has been arbitrarily assigned to (BE, MSB0) by the following coincidence: we write numbers with significance decreasing left-to-right, and we also print hex dumps with byte addresses increasing left-to-right, like English text. Big-Endian maps decreasing significance onto increasing addresses, so the two orderings line up, and the encoded value's digits appear in the dump in the same order as in the written number.

Personally, I don't assign much value to this property. Unless the values are nibble-aligned and the data is mostly zero like in the above example, hex-dumps of each of the encoding types all become equally readable; they are all hard to read. This would be a bit better with binary dumps, but then the information density is much lower.

Only 4 Encodings?

I'm assuming above that when serializing values, the LSB and MSB are written to the LSB and MSB of the storage region; this excludes "bit-reversed" representations, like the below example of encoding the 8-bit value 0xF5, starting at bit 4:

  Normal:
    bit:    15 14 13 12 │ 11 10  9  8  7  6  5  4 │  3  2  1  0
    value:   0  0  0  0 │  1  1  1  1  0  1  0  1 │  0  0  0  0
                          MSB                 LSB

  Reversed:
    bit:    15 14 13 12 │ 11 10  9  8  7  6  5  4 │  3  2  1  0
    value:   0  0  0  0 │  1  0  1  0  1  1  1  1 │  0  0  0  0
                          LSB                 MSB

The only place where I've seen bit-reversal is with misconfigured serial ports where a device is sending data MSB first but the host machine is reading it as LSB first. I imagine that any similar wire format could have similar issues, but it seems best to just configure the host machine appropriately. Considering bit-reversal would add (LE, MSB0, reversed) and (BE, LSB0, reversed) options that have more of the desirable properties above, but would also add the nonsensical (LE, LSB0, reversed) and (BE, MSB0, reversed). To keep this article focused, I am not considering these additional options.


Example Code

To demonstrate, below is some example code to read bits in each of the 4 encoding variants. Each function makes assumptions for clarity that should be checked or guaranteed in production: len must be at most 64, the buffer must contain every bit from start through start + len - 1, and the byte index (a u8 here) must not overflow. I have elided these checks because in the usage code I've written with these functions, those properties have always been guaranteed by the time the function is called.

Each of the functions decodes in 3 parts:

  1. Partial bits in start byte. Early exit for small values.
  2. Full bytes.
  3. Partial bits in last byte.

Each of the functions puts the bits directly in their final location; this is not the only way to "assemble" the bits, but it seemed to make the differences between the 4 versions the clearest. All of the functions should work regardless of platform endianness; everything is handled byte-by-byte, and the bit operations are based on numeric values, so they will set the correct bits regardless.

// Common defines
#include <stdint.h>
typedef uint64_t u64;
typedef uint32_t u32;
typedef uint16_t u16;
typedef uint8_t u8;

u64 read_bits_le_lsb0(u8 *buf, u32 start, u8 len) {
  u64 v = 0;
  u8 i = start >> 3;
  u8 n = 0;

  // 1. Partial bits in start byte.
  u8 ir = start & 0x7;
  if (ir) {
    n = 8 - ir;
    v = buf[i++] >> ir;
    if (len < n) {
      return v & ((1<<len)-1);
    }
  }

  // 2. Full bytes. The (u64) cast matters: a u8 is promoted only to int,
  // and shifting an int by 32 or more bits is undefined behavior.
  while (n+8 < len) {
    v |= (u64)buf[i++] << n;
    n += 8;
  }

  // 3. Partial bits in last byte.
  u8 r = len - n;
  if (r) {
    v |= (u64)(buf[i] & ((1<<r)-1)) << n;
  }

  return v;
}

u64 read_bits_le_msb0(u8 *buf, u32 start, u8 len) {
  u64 v = 0;
  u8 i = start >> 3;
  u8 n = 0;

  // 1. Partial bits in start byte (MSB0 bit numbers ir..7 are the byte's
  // numerically low bits).
  u8 ir = start & 0x7;
  if (ir) {
    n = 8-ir;
    v = buf[i++] & ((1<<n)-1);
    if (len <= n) {
      return v >> (n-len);
    }
  }

  // 2. Full bytes. The (u64) cast avoids undefined behavior when
  // shifting by 32 or more bits (a u8 is promoted only to int).
  while (n+8 <= len) {
    v |= (u64)buf[i++] << n;
    n += 8;
  }

  // 3. Partial bits in last byte.
  u8 r = len - n;
  if (r) {
    v |= (u64)(buf[i] >> (8-r)) << n;
  }

  return v;
}

u64 read_bits_be_lsb0(u8 *buf, u32 start, u8 len) {
  u64 v = 0;
  u8 i = start >> 3;

  // 1. Partial bits in start byte; these are the most significant bits of
  // the result, so shift them up once the remaining length is known.
  u8 ir = start & 0x7;
  if (ir) {
    u8 n = 8 - ir;
    v = buf[i++] >> ir;
    if (len > n) {
      len -= n;
      v = v << len;
    } else {
      return v & ((1<<len)-1);
    }
  }

  // 2. Full bytes, most significant first. The (u64) cast avoids
  // undefined behavior when shifting by 32 or more bits.
  while (len >= 8) {
    len -= 8;
    v |= (u64)buf[i++] << len;
  }

  // 3. Partial bits in last byte.
  if (len) {
    v |= buf[i] & ((1<<len)-1);
  }

  return v;
}

u64 read_bits_be_msb0(u8 *buf, u32 start, u8 len) {
  u64 v = 0;
  u8 i = start >> 3;

  // 1. Partial bits in start byte; these are the most significant bits of
  // the result, so shift them up once the remaining length is known.
  u8 ir = start & 0x7;
  if (ir) {
    u8 n = 8-ir;
    v = buf[i++] & ((1<<n)-1);
    if (len > n) {
      len -= n;
      v = v << len;
    } else {
      return v >> (n-len);
    }
  }

  // 2. Full bytes, most significant first. The (u64) cast avoids
  // undefined behavior when shifting by 32 or more bits.
  while (len >= 8) {
    len -= 8;
    v |= (u64)buf[i++] << len;
  }

  // 3. Partial bits in last byte.
  if (len) {
    v |= buf[i] >> (8-len);
  }

  return v;
}

All of these implementations are about equally complex. The approach taken is byte-by-byte, which I would expect to be faster than going bit-by-bit, but other than that this code has not been profiled or optimized for speed at all. I have tested each function against a variety of start/len inputs with different bit patterns. My code for doing that is not polished, but you can find it on github. These functions are a very generic solution; for simple use-cases you could write the masks and shifts manually, or if you are generating code to decode a fixed structure like a .dbc you could generate the specific masks and shifts for each signal and avoid the need for any conditionals.


My Favorite Option

My preference, whenever you have the choice of encoding, is (LE, LSB0). Little-Endian is the most common endianness on modern machines, which means we get the consistency benefit of our encoding endianness matching platform endianness if we choose LSB0. The resulting bit-numbering is monotonic, so the bits encoding any given value always occupy a numerically-contiguous region. Before writing this article I thought that, for these two reasons, it was an obvious choice. I feel less strongly after writing it; each approach has shown advantages and disadvantages in understanding, and the code for reading values had a similar level of complexity for each variant. But ultimately, to me, (LE, LSB0) still makes the most sense.

To encourage its use, here is an additional code sample for writing bits with (LE, LSB0), to go with the read_bits_le_lsb0 sample above. The structure is very similar, although writing takes extra care to avoid overwriting adjacent bit packed values.

void write_bits_le_lsb0(u8 *buf, u32 start, u8 len, u64 value) {
  u8 i = start >> 3;

  // 1. Partial bits in start byte; mask to preserve neighboring bits.
  u8 ir = start & 0x7;
  if (ir) {
    u8 n = 8 - ir;
    if (n < len) {
      buf[i] = (buf[i] & ((1<<ir)-1)) | (((u8)value) << ir);
      i++;
      len -= n;
      value >>= n;
    } else {
      u8 mask = ((1<<len)-1) << ir;
      buf[i] = (buf[i] & ~mask) | ((value << ir) & mask);
      return;
    }
  }

  // 2. Full bytes.
  while (len >= 8) {
    buf[i++] = ((u8)value);
    len -= 8;
    value >>= 8;
  }

  // 3. Partial bits in last byte; again mask to preserve neighbors.
  if (len > 0) {
    u8 mask = ((1<<len)-1);
    buf[i] = (buf[i] & ~mask) | (value & mask);
  }
}

Feel free to email me any comments about this article: contact@loganforman.com