Understanding Endianness
What is Endianness?
Endianness refers to the order in which bytes are arranged within larger data types when stored in memory or
when transmitted over networks. It determines how data is interpreted and processed by a computer system.
Why do you need to understand Endianness?
Understanding endianness is crucial for interpreting data correctly across different hardware architectures
and in network communication. Misinterpreting byte order can lead to data corruption, bugs, and system
crashes, so any code that exchanges binary data between systems or over a network must handle byte order
explicitly.
What are the types of Endianness?
Little-Endian:
In a little-endian system, the least significant byte (LSB) is stored at the smallest memory address. This means
that for a multi-byte data type, the least significant byte is stored first, followed by the next more
significant byte, and so on, so the LSB sits at offset 0.
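Example: For the 4-byte hexadecimal number 0x12345678, the memory representation in a little-endian system
would be:
Address: 0x00 0x01 0x02 0x03
Value:   78   56   34   12
This order places the LSB (78) at the smallest address (0x00).
Big-Endian:
In a big-endian system, the most significant byte (MSB) is stored at the smallest memory address, followed by
the next less significant byte, and so on.
Example: For the same number 0x12345678, the memory representation in a big-endian system would be:
Address: 0x00 0x01 0x02 0x03
Value:   12   34   56   78
This order places the MSB (12) at the smallest address (0x00).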
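Both layouts can be produced without relying on the host's own byte order by writing each byte out with shifts. The following is a minimal sketch; the function names store_le32 and store_be32 are hypothetical and do not come from the text above.

```c
#include <stdint.h>

/* Write v into out[0..3] with the least significant byte first. */
void store_le32(uint32_t v, uint8_t out[4]) {
    out[0] = (uint8_t)(v & 0xFF);          /* LSB at the lowest address */
    out[1] = (uint8_t)((v >> 8) & 0xFF);
    out[2] = (uint8_t)((v >> 16) & 0xFF);
    out[3] = (uint8_t)((v >> 24) & 0xFF);  /* MSB at the highest address */
}

/* Write v into out[0..3] with the most significant byte first. */
void store_be32(uint32_t v, uint8_t out[4]) {
    out[0] = (uint8_t)((v >> 24) & 0xFF);  /* MSB at the lowest address */
    out[1] = (uint8_t)((v >> 16) & 0xFF);
    out[2] = (uint8_t)((v >> 8) & 0xFF);
    out[3] = (uint8_t)(v & 0xFF);          /* LSB at the highest address */
}
```

For 0x12345678, store_le32 fills the buffer with 78 56 34 12 and store_be32 fills it with 12 34 56 78.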
The terms "little-endian" and "big-endian" originate from Jonathan Swift's "Gulliver's Travels," where they
describe factions that broke their eggs at different ends. In computing, these terms were popularized by
computer scientist Danny Cohen in his 1980 paper "On Holy Wars and a Plea for Peace," which discussed the
challenges of byte ordering in network communication.
Bit Order
Regardless of whether a system uses big-endian or little-endian byte order, the bits within each byte are
conventionally written with the most significant bit (MSB) on the left and the least significant bit (LSB) on
the right. Byte order only affects how whole bytes are arranged; it does not change the value of any
individual byte.
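Shifts and masks operate on bit significance rather than on storage position, so code like the following behaves the same on little-endian and big-endian systems. This is a small sketch, and the helper name bit_at is hypothetical.

```c
#include <stdint.h>

/* Return bit `pos` of `byte`, where pos 0 is the least significant bit
   and pos 7 is the most significant bit. */
int bit_at(uint8_t byte, unsigned pos) {
    return (byte >> pos) & 1;
}
```

For the byte 0x80 (binary 10000000), bit_at returns 1 for position 7 and 0 for every other position.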
Endianness is determined by the architecture of the CPU and the conventions used in the software and protocols
that the system employs.
• CPU Architecture: Some CPUs are hardwired to use a specific endianness (e.g., x86 uses little-endian),
while others, like ARM, can operate in either mode (bi-endian).
• Network Protocols: Internet protocols such as IP, TCP, and UDP standardize on big-endian format (network
byte order) to ensure consistent data interpretation across diverse systems.
#include <stdio.h>

int main(void) {
    printf("System is %s\n", detect_endianness());
    return 0;
}
The detect_endianness function stores the value 1 in a 16-bit variable (unsigned short x = 1) and checks the
first byte (char *c = (char*)&x). If the first byte is 1, the system is little-endian; otherwise, it is
big-endian.
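The definition of detect_endianness does not appear in the text. A minimal sketch that follows the description above is given here; the return type and the exact strings are assumptions, chosen so that the result can be printed with %s.

```c
/* Return a string naming the host byte order (return strings are assumed). */
const char *detect_endianness(void) {
    unsigned short x = 1;
    char *c = (char *)&x;  /* points at the lowest-addressed byte of x */
    return (*c == 1) ? "little-endian" : "big-endian";
}
```

Inspecting an object through a char pointer is permitted in C, which is why this check does not need a union or a copy.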
How to Handle Endianness in Network Protocols?
Network protocols typically use big-endian format (network byte order) to ensure consistent data interpretation
across different systems. When sending data over a network, you convert it from host byte order to network
byte order using functions like htonl (32-bit values) and htons (16-bit values). Similarly, when receiving data,
you convert it from network byte order to host byte order using ntohl and ntohs.
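As an illustration, the sketch below writes a 32-bit value into a byte buffer in network byte order and reads it back. It assumes a POSIX system, where these functions are declared in <arpa/inet.h> (on Windows they come from <winsock2.h> instead); the helper names put_u32 and get_u32 are hypothetical.

```c
#include <arpa/inet.h>  /* htonl, ntohl (POSIX) */
#include <stdint.h>
#include <string.h>

/* Copy a 32-bit host-order value into buf in network byte order. */
void put_u32(uint8_t *buf, uint32_t host_val) {
    uint32_t net_val = htonl(host_val);
    memcpy(buf, &net_val, sizeof net_val);
}

/* Read a 32-bit network-order value from buf and return it in host order. */
uint32_t get_u32(const uint8_t *buf) {
    uint32_t net_val;
    memcpy(&net_val, buf, sizeof net_val);
    return ntohl(net_val);
}
```

After put_u32(buf, 0x12345678), the buffer holds 12 34 56 78 on any host, and get_u32(buf) returns 0x12345678.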
What is Bi-Endian?
Bi-endian processors can operate in either little-endian or big-endian mode. This flexibility allows them to
interact with different systems and networks seamlessly. An example of a bi-endian processor is the ARM
architecture, which can be configured to use either endianness depending on the application requirements.
How can we convert a 32-bit integer from Little-Endian to Big-Endian?
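#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint32_t little_endian = 0x12345678;
    uint32_t big_endian = swap_endian(little_endian);
    printf("Little-endian: 0x%08x\n", little_endian);
    printf("Big-endian: 0x%08x\n", big_endian);
    return 0;
}
Output:
Little-endian: 0x12345678
Big-endian: 0x78563412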
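The program above calls swap_endian, but its definition is not shown. A minimal sketch is given here, assuming a plain shift-and-mask byte reversal (one of several common ways to write it).

```c
#include <stdint.h>

/* Reverse the order of the four bytes in a 32-bit value. */
uint32_t swap_endian(uint32_t value) {
    return ((value >> 24) & 0x000000FFu) |  /* byte 3 -> byte 0 */
           ((value >> 8)  & 0x0000FF00u) |  /* byte 2 -> byte 1 */
           ((value << 8)  & 0x00FF0000u) |  /* byte 1 -> byte 2 */
           ((value << 24) & 0xFF000000u);   /* byte 0 -> byte 3 */
}
```

Applying the function twice returns the original value, so the same routine converts in both directions.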
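A 64-bit value can be converted to network byte order with a helper such as htonll:
#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint64_t host_val = 0x123456789ABCDEF0;
    uint64_t network_val = htonll(host_val);
    printf("Host byte order: 0x%016llx\n", (unsigned long long)host_val);
    printf("Network byte order: 0x%016llx\n", (unsigned long long)network_val);
    return 0;
}
Output (on a little-endian host):
Host byte order: 0x123456789abcdef0
Network byte order: 0xf0debc9a78563412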
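POSIX defines only the 16-bit and 32-bit conversions, and the program above does not show how htonll is defined. A minimal sketch is given here; it works on either kind of host by laying the bytes out most significant first. The #ifndef guard is an assumption, added in case a platform already provides htonll as a macro.

```c
#include <stdint.h>
#include <string.h>

#ifndef htonll  /* assumed guard: skip if the platform supplies a macro */
/* Return v rearranged so that its in-memory bytes are in network order. */
uint64_t htonll(uint64_t v) {
    uint8_t bytes[8];
    for (int i = 0; i < 8; i++) {
        bytes[i] = (uint8_t)(v >> (56 - 8 * i));  /* most significant byte first */
    }
    uint64_t out;
    memcpy(&out, bytes, sizeof out);
    return out;
}
#endif
```

On a big-endian host the function returns its argument unchanged; on a little-endian host it reverses the eight bytes.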
Given a byte array in little-endian format, we might need to write a function in C to read a 32-bit integer from
the array and convert it to host byte order:
#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint8_t little_endian_data[4] = {0x78, 0x56, 0x34, 0x12};
    uint32_t host_val = read_little_endian(little_endian_data);
    printf("Read value: 0x%08x\n", host_val);
    return 0;
}
Output:
Read value: 0x12345678
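The definition of read_little_endian is not shown above. A minimal sketch is given here; the parameter type is an assumption. It assembles the value with shifts, so the result is correct whatever the host byte order.

```c
#include <stdint.h>

/* Build a host-order 32-bit value from 4 bytes stored least significant first. */
uint32_t read_little_endian(const uint8_t *data) {
    return (uint32_t)data[0] |
           ((uint32_t)data[1] << 8) |
           ((uint32_t)data[2] << 16) |
           ((uint32_t)data[3] << 24);
}
```

Each byte is widened to uint32_t before shifting so that the shift by 24 cannot overflow a signed int.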
Bonus:
How to transmit a 24-bit Integer as a 32-bit Integer?
- Packing a 24-bit Integer into a 32-bit Integer
• Little-endian Format: The 24-bit value is packed into the lower 3 bytes of the 32-bit container, with the
most significant byte (MSB) set to zero.
• Big-endian Format: The 24-bit value is packed into the upper 3 bytes of the 32-bit container, with the
least significant byte (LSB) set to zero.
- Unpacking a 24-bit Integer from a 32-bit Integer
• Little-endian Format: Extract the lower 3 bytes from the 32-bit container.
• Big-endian Format: Extract the upper 3 bytes from the 32-bit container.
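#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint32_t val_24 = 0x123456; // Example 24-bit value
    uint32_t packed_32 = pack_24_to_32_little(val_24);
    uint32_t unpacked_24 = unpack_24_from_32_little(packed_32);
    printf("Packed: 0x%08x\n", packed_32);
    printf("Unpacked: 0x%06x\n", unpacked_24);
    return 0;
}
Output:
Packed: 0x00123456
Unpacked: 0x123456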
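The helpers pack_24_to_32_little and unpack_24_from_32_little used above are not defined in the text. A minimal sketch is given here; it follows the little-endian packing rule described earlier (value in the lower 3 bytes, most significant byte zero).

```c
#include <stdint.h>

/* Keep the low 24 bits; the top byte of the container is zero. */
uint32_t pack_24_to_32_little(uint32_t val_24) {
    return val_24 & 0x00FFFFFFu;
}

/* Extract the lower 3 bytes of the container. */
uint32_t unpack_24_from_32_little(uint32_t packed_32) {
    return packed_32 & 0x00FFFFFFu;
}
```

Masking on both sides means that stray bits above bit 23 in the input are discarded rather than transmitted.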
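int main(void) {
    uint32_t val_24 = 0x123456; // Example 24-bit value
    uint32_t packed_32 = pack_24_to_32_big(val_24);
    uint32_t unpacked_24 = unpack_24_from_32_big(packed_32);
    printf("Packed: 0x%08x\n", packed_32);
    printf("Unpacked: 0x%06x\n", unpacked_24);
    return 0;
}
Output:
Packed: 0x12345600
Unpacked: 0x123456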
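For the big-endian format described earlier (value in the upper 3 bytes, least significant byte zero), a matching pair of helpers might look like the sketch below. The names pack_24_to_32_big and unpack_24_from_32_big are assumed by analogy with the little-endian helpers.

```c
#include <stdint.h>

/* Place the 24-bit value in the upper 3 bytes; the low byte is zero. */
uint32_t pack_24_to_32_big(uint32_t val_24) {
    return (val_24 & 0x00FFFFFFu) << 8;
}

/* Extract the upper 3 bytes of the container. */
uint32_t unpack_24_from_32_big(uint32_t packed_32) {
    return (packed_32 >> 8) & 0x00FFFFFFu;
}
```

With the example value 0x123456, the packed container is 0x12345600.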
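Two 24-bit values can also be packed into a single 64-bit container. Little-endian version:
int main(void) {
    uint32_t val1_24 = 0x123456; // Example first 24-bit value
    uint32_t val2_24 = 0x789ABC; // Example second 24-bit value
    uint64_t packed_64 = pack_two_24_to_64_little(val1_24, val2_24);
    uint32_t unpacked_val1_24, unpacked_val2_24;
    unpack_two_24_from_64_little(packed_64, &unpacked_val1_24, &unpacked_val2_24);
    printf("Unpacked: 0x%06x 0x%06x\n", unpacked_val1_24, unpacked_val2_24);
    return 0;
}
Output:
Unpacked: 0x123456 0x789abc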
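The text does not define pack_two_24_to_64_little or unpack_two_24_from_64_little, and it does not state the bit layout. The sketch below assumes the first value occupies bits 0-23, the second occupies bits 24-47, and the top 16 bits are zero; a different layout would need different shifts.

```c
#include <stdint.h>

/* Assumed layout: val1 in bits 0-23, val2 in bits 24-47, bits 48-63 zero. */
uint64_t pack_two_24_to_64_little(uint32_t val1, uint32_t val2) {
    return (uint64_t)(val1 & 0x00FFFFFFu) |
           ((uint64_t)(val2 & 0x00FFFFFFu) << 24);
}

/* Recover both 24-bit values from the assumed layout. */
void unpack_two_24_from_64_little(uint64_t packed, uint32_t *val1, uint32_t *val2) {
    *val1 = (uint32_t)(packed & 0x00FFFFFFu);
    *val2 = (uint32_t)((packed >> 24) & 0x00FFFFFFu);
}
```

Under this assumed layout, packing 0x123456 and 0x789ABC gives 0x0000789ABC123456.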
Big-endian version:
int main(void) {
    uint32_t val1_24 = 0x123456; // Example first 24-bit value
    uint32_t val2_24 = 0x789ABC; // Example second 24-bit value
    uint64_t packed_64 = pack_two_24_to_64_big(val1_24, val2_24);
    uint32_t unpacked_val1_24, unpacked_val2_24;
    unpack_two_24_from_64_big(packed_64, &unpacked_val1_24, &unpacked_val2_24);
    printf("Unpacked: 0x%06x 0x%06x\n", unpacked_val1_24, unpacked_val2_24);
    return 0;
}
Output:
Unpacked: 0x123456 0x789abc
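The functions pack_two_24_to_64_big and unpack_two_24_from_64_big are likewise not defined in the text. The sketch below assumes the first value occupies the top 3 bytes (bits 40-63), the second occupies the next 3 bytes (bits 16-39), and the low 16 bits are zero.

```c
#include <stdint.h>

/* Assumed layout: val1 in bits 40-63, val2 in bits 16-39, bits 0-15 zero. */
uint64_t pack_two_24_to_64_big(uint32_t val1, uint32_t val2) {
    return ((uint64_t)(val1 & 0x00FFFFFFu) << 40) |
           ((uint64_t)(val2 & 0x00FFFFFFu) << 16);
}

/* Recover both 24-bit values from the assumed layout. */
void unpack_two_24_from_64_big(uint64_t packed, uint32_t *val1, uint32_t *val2) {
    *val1 = (uint32_t)((packed >> 40) & 0x00FFFFFFu);
    *val2 = (uint32_t)((packed >> 16) & 0x00FFFFFFu);
}
```

Under this assumed layout, packing 0x123456 and 0x789ABC gives 0x123456789ABC0000.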
Summarized:
Little-endian: Little end first. The least significant byte is stored first: LSB at the lowest address, MSB at
the highest address.
Big-endian: Big end first. The most significant byte is stored first: MSB at the lowest address, LSB at the
highest address.