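When you read a pixel from a BG layer, you get a palette index. That index is combined with the 3 "palette subgroup" bits of the pixel's tilemap entry to select the final color. Here colordepth means the number of colors per subgroup (4, 16 or 256), not the number of bits per pixel. Usually the result is subgroup*colordepth+index, but in Mode0 it's subgroup*colordepth+index+(BG-1)*32.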
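As a rough illustration of the formula above, here is a minimal Python sketch. The function name cgram_index and its parameter names are made up for this example and do not come from any emulator or SDK. The 256-color branch assumes the subgroup bits are simply ignored in that case.

```python
def cgram_index(index, subgroup, colors, mode=1, bg=1):
    """Return the global palette entry selected by a BG pixel.

    index    -- palette index read from the tile data (0 .. colors-1)
    subgroup -- 3-bit palette subgroup from the tilemap entry (0 .. 7)
    colors   -- colors per subgroup for this layer: 4, 16 or 256
    mode     -- BG mode number (0 .. 7)
    bg       -- layer number, 1-based (1 for BG1 ... 4 for BG4)
    """
    if colors == 256:
        # Assumption: with 256 colors the subgroup bits have no effect,
        # because the global palette only has 256 entries.
        return index
    entry = subgroup * colors + index
    if mode == 0:
        # Mode0 gives each BG its own block of 32 palette entries.
        entry += (bg - 1) * 32
    return entry


# A 16-color layer, subgroup 2, pixel index 3:
print(cgram_index(3, 2, 16))                # 35
# Mode0 (4 colors per subgroup), BG4, subgroup 7, pixel index 1:
print(cgram_index(1, 7, 4, mode=0, bg=4))   # 125
```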
When colordepth is 256 (that applies to BG1 in Mode3, Mode4 or Mode7), the subgroup bits would be useless, since there are only 256 colors in the global palette.
Therefore, in DirectColor mode ($2130.0=1) the bits are combined like this:
index (8-bit) = BBGGGRRR
subgroup (3-bit) = bgr
result (15-bit) = BBb00GGGg0RRRr0
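The bit layout above can be sketched in Python as follows. The function name direct_color is hypothetical, and the sketch assumes the subgroup value carries b in bit 2, g in bit 1 and r in bit 0, as the "bgr" line suggests. The result is a 15-bit value with blue in bits 14-10, green in bits 9-5 and red in bits 4-0.

```python
def direct_color(index, subgroup=0):
    """Build the 15-bit DirectColor value from an 8-bit pixel index
    (BBGGGRRR) and a 3-bit subgroup (bgr)."""
    r3 = index & 0x07          # RRR
    g3 = (index >> 3) & 0x07   # GGG
    b2 = (index >> 6) & 0x03   # BB

    r_lo = subgroup & 1          # r
    g_lo = (subgroup >> 1) & 1   # g
    b_lo = (subgroup >> 2) & 1   # b

    red = (r3 << 2) | (r_lo << 1)     # RRRr0
    green = (g3 << 2) | (g_lo << 1)   # GGGg0
    blue = (b2 << 3) | (b_lo << 2)    # BBb00

    return (blue << 10) | (green << 5) | red


# All index and subgroup bits set gives the brightest reachable color:
print(hex(direct_color(0xFF, 0b111)))   # 0x73de
# Without subgroup bits (subgroup=0), only the index contributes:
print(hex(direct_color(0x07)))          # 0x1c
```

Note that the lowest bit of each channel is always 0, so even with the subgroup bits the brightest value per channel is 30 for red and green and 28 for blue, not 31.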
Note that in Mode7 there are no subgroup bits, so it still uses only 256 colors. (Or 255, since index=0 means "transparent pixel".)