Android 14 uses "nonsensical" logic to calculate space usage on smartphones

Alfonso Maruccia

Facepalm: While embedded storage in modern smartphones has grown considerably, the software side of things still tends to make weird or outright incorrect calculations about available space. This problem persists in the upcoming Android release and is present in most third-party reskinned versions of the OS.

Both old and current versions of Android suffer from a peculiar bug concerning how the OS calculates storage space usage on mobile devices. Android specialist Mishaal Rahman, who uncovered this issue earlier this year, emphasized that even in the impending Android 14 release, the method Android uses to calculate space occupied by the OS "system" files remains fundamentally flawed.

The way Google's operating system calculates "System" storage is utterly illogical, as Rahman explained on X. When new files are added to a smartphone's built-in storage, Android counts them toward the "System" category if they can't be placed in other categories such as images, videos, documents, and so on. In simpler terms, Rahman pointed out that Android arrives at the "System" figure by merely subtracting the storage attributed to everything else from the total storage space that's currently in use.
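To make that subtraction concrete, here is a minimal sketch of the logic Rahman describes. It is not Android's actual code, and the category names and figures are purely illustrative:

```kotlin
// Sketch of the "System" calculation Rahman describes -- not Android's real code.
// Anything the categorizer can't attribute ends up lumped into "System".
fun estimateSystemBytes(
    totalBytes: Long,                       // total capacity of built-in storage
    freeBytes: Long,                        // space currently unused
    categorizedBytes: Map<String, Long>     // e.g. "Apps", "Images", "Videos", "Documents"
): Long {
    val usedBytes = totalBytes - freeBytes
    return usedBytes - categorizedBytes.values.sum()
}

fun main() {
    val gib = 1024L * 1024 * 1024
    val system = estimateSystemBytes(
        totalBytes = 128 * gib,
        freeBytes = 80 * gib,
        categorizedBytes = mapOf("Apps" to 20 * gib, "Images" to 10 * gib)
    )
    println("Reported 'System': ${system / gib} GiB")  // prints 18 GiB, including any uncategorized user files
}
```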

Even user-created files stored in the /data/media directory, which are not system files by any reasonable definition, are counted by Android as part of "System." Rahman demonstrated the bug by running a shell command that generated a 3GB file filled with random data; as soon as the file was created, the "System" category grew by 3GB.
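The article doesn't quote the exact command Rahman used, but the experiment can be approximated with a sketch along these lines, which writes roughly 3GB of random data into shared storage (the file path is an assumption for illustration):

```kotlin
import java.io.File
import kotlin.random.Random

// Approximation of the experiment described above: write ~3 GiB of random data
// into shared storage. The path is illustrative; on most devices the user's
// shared storage lives under /data/media/0, exposed as /sdcard.
fun main() {
    val target = File("/sdcard/random.bin")
    val chunk = ByteArray(8 * 1024 * 1024)      // 8 MiB buffer
    target.outputStream().use { out ->
        repeat(384) {                            // 384 x 8 MiB = 3 GiB
            Random.nextBytes(chunk)
            out.write(chunk)
        }
    }
    // After this, Android's storage settings attribute the extra ~3GB to "System".
}
```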

Besides misrepresenting how storage is used, the bug also affects how the "Files" app calculates space, likely because it relies on the same flawed logic as the OS. Third-party "reskinned" versions of Android are affected as well, with one notable exception: Samsung's One UI 6, according to Rahman, reports accurately how files consume space on a device.

Furthermore, as Rahman explained, Android has another issue with storage reporting. Google counts storage in "gibibytes," where one gibibyte equals 1024^3 bytes. Manufacturers, in contrast, advertise capacity in "gigabytes" of 1000^3 bytes each, in line with the prefix standards the International Electrotechnical Commission (IEC) officially adopted in 1998.

"Gibibytes" is the correct definition for representing the space that's actually available on a storage unit, but it can mislead users about the space their phone's manufacturer advertised. Rahman pointed out that this issue persists in Android 14, potentially leading users to unnecessarily perform factory reset procedures in an attempt to reclaim additional space that never actually existed.

Gigabytes stored as 10^9 bytes is stupid; products should be advertised using 2^30 bytes per GB regardless of the metric prefix. No one is going to say "gibby-bites" lol. So 128 GB should equal exactly 2^37 bytes. Written out in base 10, that same capacity is 137,438,953,472 bytes, which is a stupid number to advertise. Otherwise why sell storage in 2, 4, 8, 16, 32, 64, 128, 256, and 512 sizes at all? Those are the units all the storage manufacturers use, so advertising 128 "billion" bytes is extremely disingenuous.
 

Because it's convenient for them. They create storage products and sell them to people, so, who cares! They get to sell you a "128GB" storage device that will NEVER fit "128GB" of data, obviously, because the hardware and software use different measuring systems, but that's on you as the customer to learn/know. Enter the "GiB" term, which I can never keep straight (I'd have to Google it every time), as the "solution" that simply creates even more problems.

The main issue is that the IEC had to chime in and "define" the metric system onto byte storage, which is flawed, to say the least:
- a byte should have had 10 bits, and a bit would then be a dB (deci-byte, which also collides with decibels); that 8 bits = 1 byte is totally un-metric
- not sure what 1/10 of a bit would be, a cB (centi-byte) if you will
- smaller units are just... pB (pico-byte? 1 B x 10^-12)
- not to mention that we're missing the daB and hB units everywhere
- where does the nibble fit into their system?
 
Both the hardware and the software use base 2. That's literally how every computer works; it's a requirement that goes all the way down to the transistor level. It applies just as much to the software, and I'm saying this as a front-end software engineer.

You have no idea what you're talking about when you say that a byte should've had 10 bits; that wouldn't have fixed anything. If anything, a 10-bit byte would be represented by two base-32 characters, and that would make things incredibly complicated for programmers: 0-9 and A-V (digits 10-31). 8 bits means a byte can be represented by two base-16 (hexadecimal) characters, 0-9 and A-F (10-15). You'd be requiring programmers to memorize a much longer digit alphabet, where a majority of the characters are unfamiliar alphabetic digits (A-V) instead of a minority (A-F).

In case you didn't know, hexadecimal and base-32 are directly derived from the byte size. 8 bits (2^8 = 256 values) split in half into 2^4 * 2^4, or 16 * 16 (0-255, i.e. 00-FF). Sometimes other sizes are used to store data, but they're always represented in a base that is a power of 2. The reason is that it's unquestionably a better number system, especially for computers: it has far more factors than just 2 * 5. It's also the reason why so many units of measure rely on some multiple of two or four. The only reason 10 is used is that we have 10 fingers, which is extremely unscientific.
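A quick sketch of that nibble split, purely as an illustration of the point above:

```kotlin
// Illustration only: an 8-bit byte splits into two 4-bit halves, each one hex digit.
// A hypothetical 10-bit "byte" would instead split into two 5-bit halves,
// i.e. two base-32 digits (0-9 plus A-V).
fun main() {
    val value = 0xAF                        // 175 in decimal
    val high = (value shr 4) and 0xF        // upper nibble: 0xA (10)
    val low = value and 0xF                 // lower nibble: 0xF (15)
    println("%02X -> high nibble %X, low nibble %X".format(value, high, low))
}
```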

So no, bytes will never change from 8 bits; if anything, it's more likely that humans' numbering system will change from decimal to octal or hexadecimal.
 
Yo, chill, I wasn't attacking you. I was just extending the "logic" that the IEC applied to k, M, G, etc. I know how the binary system works, and the basic unit is the bit, not the byte; my point is that the whole idea of an "international unit system" for storage makes for an awful sales pitch: here's your device with 128Gb... how much that "means" would depend on your "word size". There's also the problem of negative exponents: what would 1 b x 10^-3 even mean (a mb)?
And the main problem: the mismatched prefixes, which are the whole point of your original comment, the article, and this entire debacle. In the metric system, consecutive prefixes are 10 or 1,000 apart on a linear scale, not related at all to powers of 2. The whole point of my comment was to show that the international unit system (the metric system) is a BAD solution for counting bits and bytes. The fact that the prefixes are shared between the two systems pushed some empty-headed illuminati to "make them the same," and that's why we're here.
 