The Single Byte That Kills Your Exploit: Understanding Endianness
Key topics
The article discusses how endianness can affect exploit development; commenters question its relevance and the author's experience level, sparking a debate about how much endianness actually matters when writing exploits.
Snapshot generated from the HN discussion
Discussion Activity
Active discussion: 25 loaded comments, averaging 8.3 per period, peaking at 18 comments in the 72-84 hour window after posting.
Key moments
- Story posted: Nov 9, 2025 at 8:56 AM EST
- First comment: Nov 12, 2025 at 2:51 PM EST (3 days after posting)
- Peak activity: 18 comments in the 72-84h window
- Latest activity: Nov 13, 2025 at 3:52 PM EST
For instance, given the word DEADBEEF, the least significant byte is EF.
That is a specific binary value: the value 239.
That value stays the same whether the bytes are EF BE AD DE in memory, or DE AD BE EF.
EF is just 239. We don't think about reversing the bits; they are not addressable. They have an abstract order determined by the binary system. The most significant bit of the value contributes 128 and so on.
The order matters when the bits have to be transmitted over a wire to another machine. Then we have to decide: do we transmit the low bit of EF first, or the high bit 1? If the two sides of the data link are inconsistent, then one side transmits 11101111 and the other receives 11110111, which is F7.
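A quick sketch of what that mismatch amounts to: reading an LSB-first stream as if it were MSB-first is the same as reversing the bit order within each byte (plain C, using the values from the example above):

    #include <stdio.h>
    #include <stdint.h>

    /* Reverse the bit order within one byte, as if a link that sends the
       least significant bit first were read by a receiver that expects
       the most significant bit first. */
    static uint8_t reverse_bits(uint8_t b)
    {
        uint8_t r = 0;
        for (int i = 0; i < 8; i++)
            r = (uint8_t)((r << 1) | ((b >> i) & 1));
        return r;
    }

    int main(void)
    {
        printf("%02X -> %02X\n", 0xEFu, (unsigned)reverse_bits(0xEF)); /* EF -> F7 */
        return 0;
    }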
Little endian's most valuable property is that an integer stored at an address has a common layout no matter the width of the integer. If I store an i32 at 0x100, and then load an i16 from 0x100, that's the same as casting (with wrapping) the i32 to an i16, because the "ones digit" (more accurately the "ones byte") is stored at the same place for both integers.
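A minimal C illustration of that property (memcpy is used for the narrow reload to keep the aliasing legal):

    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>

    int main(void)
    {
        uint32_t wide = 0xDEADBEEF;
        uint16_t narrow;

        /* Re-read the two bytes at the lowest address of the 32-bit value. */
        memcpy(&narrow, &wide, sizeof narrow);

        /* On a little-endian host both lines print BEEF: the "ones byte"
           sits at the lowest address, so the narrower load behaves like a
           wrapping cast. On a big-endian host the first line prints DEAD. */
        printf("%04X\n", (unsigned)narrow);
        printf("%04X\n", (unsigned)(uint16_t)wide);
        return 0;
    }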
Since bits aren't addressable, they don't really have an order in memory. The only way to access bits is by loading them into a register, and registers don't meaningfully have an endianness.
The most obvious argument is that little endian is clearly the most natural order - the only reason to use big endian is to match the stupid human history of mixing LTR text with RTL numbers.
I've seen one real technical reason to prefer little endian (can't remember what it was tbh but it was fairly niche) and I've never seen any technical reasons to prefer big endian ("it's easier to read in a hex editor" doesn't count).
Bits aren't addressable in the dominant ISAs today, but they were addressable by popular ISAs in the past, such as the PDP-10 family.
The PDP-10 is one of the big reasons why network byte order is big-endian.
That said, I forget whether the PDP-10 was big-endian or little-endian wrt bits.
For bytes, you can distinguish the two orders, since you can look at the individual bytes produced by a larger-than-byte store.
Well, normally when bits are numbered, "bit 0" is the least significant bit. The MSB is usually written on the left (as in the notation for left and right shifts), but that doesn't necessarily make it "first" in my mind.
IBM being abnormal. Can't argue with that.
But even if machines were like this, it would not cause any interoperability issue, because it is the data links between machines that ensure bits are transmitted and received in the correct order, not the semantics of machine instructions.
It would be something to worry about when translating code from one language or instruction set to another.
Data link and physical protocols ensure that when you transmit a byte with a certain decimal value like 65 (ASCII 'A'), it is received as 65 on the other end.
The bits are "addressable" at the data link level, because the hardware has to receive a certain bit first, and the one after that next, and so on.
Bits are typically not addressable, and therefore do not have endianness.
Bits are manipulated by special instructions, and those instructions are tied to arithmetic identities, because the bits are interpreted as a binary number: for instance, a shift left is a multiplication by 2.
In many instruction sets, the shift is a positive amount, and whether it is left or right is a different instruction. If it were the case that shifting one way is positive and the other way negative, then you would have a kind of endianness, in that one machine uses positive amounts for multiplication by powers of two, whereas another uses them for division. That would not result in an incompatible storage format, though.
When data is transferred between machines as a sequence of bytes, there is a bit order in question, but it is taken care of by the compatibility of the data links.
Classic Ethernet is little endian at the bit level: the baseband pulses that represent the bits of a byte are sent onto the coax cable least-significant-bit first. RS-232 serial communication is the same: least significant bit first.
I think I²C is an example of a data link / physical protocol that is most-significant-bit first. So if you somehow hooked up an RS-232 end to I²C and got the communication to work, the bits within each byte would come out reversed.
We rarely, if ever, see bit-endian effects because nobody does that: transmit bytes between incompatible data links. It won't work for other reasons, like different framing bits, signaling conventions, voltages, speeds, synchronization methods, checksums, ...
Endianness of bits shows up in some data formats which pack individual bitfields of variable length.
Bitfields in C structures reveal bit endianness to some extent. What typically happens is that on a big endian target, bit fields are packed into the most significant bit of the underlying "cell" first. E.g.
the underlying cell might be the size of an int, like 32 bits. So where in the cell do "a" and "b" go? What you see under GCC is that on a big endian target, a will go to the most significant bit of the underlying storage cell, and b to the second most significant one. Whereas on little endian, a goes to the least significant bit, and b to the second least. In both cases, the bits map to the first byte, at the lowest address.

So in a certain sense, the allocation of members in C, as such, is little endian: the earlier struct members go to the lowest address, regardless of machine endianness. It is probably because of that that the bit order follows. Since putting bitfield a at the lowest address, as mandated by C field layout order, means that it has to go into the first byte, and that first byte is the most significant byte under big endian, it makes sense that the bit goes into the most significant bit position, for consistency.
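Concretely, a small sketch of that situation (two one-bit fields, a declared before b, sharing an int-sized cell; the union and printf are scaffolding added here just to show where a lands):

    #include <stdio.h>

    struct flags {
        unsigned int a : 1;   /* declared first */
        unsigned int b : 1;   /* declared second */
    };

    int main(void)
    {
        union { struct flags f; unsigned int raw; } u;
        u.raw = 0;
        u.f.a = 1;

        /* Typical GCC behaviour: on a little-endian target this prints
           00000001 (a is in the least significant bit of the cell); on a
           big-endian target it prints 80000000 (a is in the most
           significant bit). Either way, a lands in the byte at the
           lowest address. */
        printf("%08X\n", u.raw);
        return 0;
    }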
That way we only have two possibilities to deal with for, say, a memory mapped status register:
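Something along these lines, assuming GCC's predefined byte-order macros and a made-up register layout (an 8-bit ready field in the top byte of a 32-bit word):

    #include <stdint.h>

    /* Hypothetical memory-mapped status register: an 8-bit READY field in
       bits 31..24 and a 24-bit counter in bits 23..0. Only the byte order
       needs to be checked, because the bit order follows it. */
    struct status_reg {
    #if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
        uint32_t ready   : 8;
        uint32_t counter : 24;
    #else
        uint32_t counter : 24;
        uint32_t ready   : 8;
    #endif
    };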
If we had separate byte and bit order, we would need two levels of #if nesting and four possibilities, which is even more ugly.

> If you’ve ever crafted a perfect shellcode and ROP chain only to have your exploit immediately crash with a SIGSEGV (a signal sent by the operating system to a program when it attempts to access a protected or invalid memory location) or EIP (a 32-bit CPU register in the x86 architecture that holds the memory address of the next machine instruction to be executed) pointing to garbage, you’ve likely met the silent killer of beginners: Endianness.
Aren't there a million other ways to get addresses wrong?
> Using x86/x86_64 gadgets and packers on a MIPS/PowerPC target (different endianness and instruction set) will not work.
"and instruction set" is carrying a lot of weight here.
This isn't like a coin flip thing: even considering architectures with configurable endianness, in 2025 it's overwhelmingly likely both host and target are little-endian. And on old, big-endian platforms, that's just one of many things you have to get right.
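For concreteness, a sketch of the one thing endianness actually changes in a payload: how the same 32-bit address (a hypothetical gadget address here) has to be laid out byte by byte for a little-endian versus a big-endian target:

    #include <stdio.h>
    #include <stdint.h>

    /* Write a 32-bit value into a payload buffer in an explicit byte order,
       independent of the host's own endianness. */
    static void put32_le(uint8_t *p, uint32_t v)
    {
        p[0] = (uint8_t)v;         p[1] = (uint8_t)(v >> 8);
        p[2] = (uint8_t)(v >> 16); p[3] = (uint8_t)(v >> 24);
    }

    static void put32_be(uint8_t *p, uint32_t v)
    {
        p[0] = (uint8_t)(v >> 24); p[1] = (uint8_t)(v >> 16);
        p[2] = (uint8_t)(v >> 8);  p[3] = (uint8_t)v;
    }

    int main(void)
    {
        uint32_t gadget = 0x080491E2;   /* hypothetical gadget address */
        uint8_t le[4], be[4];

        put32_le(le, gadget);   /* E2 91 04 08: what a little-endian target expects */
        put32_be(be, gadget);   /* 08 04 91 E2: what a big-endian target expects */

        for (int i = 0; i < 4; i++) printf("%02X ", (unsigned)le[i]);
        printf("| ");
        for (int i = 0; i < 4; i++) printf("%02X ", (unsigned)be[i]);
        printf("\n");
        return 0;
    }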
But I guess the real target audience is probably people who are just starting out on CTFs and trying to string stuff together without a proper understanding of the fundamentals. Everyone has to start somewhere, and I guess if people are just using packers and tools to generate exploit code, then it's quite easy to use the wrong flags and not know what is going on.
I suspect the target audience is "whoever will subscribe on Substack" more than someone who has ever written or contemplated writing shellcode. I'm seeing more and more articles like this that focus the prose on some weird subset-of-a-niche aspect of a subject, then end with a set of bullet points for fixing the problem as if this is something one regularly encounters.
While I personally learned about endianness before writing my first exploit, I've definitely made endianness-related mistakes before.
No, more likely you've hit one of any number of other bugs, or the conditions in the target host are not right for the exploit to work.
And it should really only be a trip-up on bi-endian architectures, so… PowerPC and MIPS? I guess ARM if you're getting extremely exotic targets…