Mode 7 data occupies the first half of VRAM.
The tilemap and character map are interleaved: the character data is in the
high byte of each word and the tilemap data is in the low byte. (Note that in
hardware, VRAM is set up such that odd bytes are in one RAM chip and even bytes
in another, and each RAM chip has a separate address bus. The Mode 7 renderer
probably accesses the two chips independently.) The tilemap is 128x128 entries
of one byte each, with each byte being simply a character map index. The
character data is stored as packed pixels rather than bitplanes, with one pixel
per byte.
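
The interleaved layout can be illustrated with a small lookup sketch. This is
not taken from the hardware or from any emulator. It assumes 8x8-pixel
characters stored row by row (64 bytes each), a row-major tilemap, and a VRAM
dump in which the low byte of each word sits at the even byte offset. The name
mode7_pixel and its parameters are made up for the example, and no
rotation or scaling is applied.

```python
TILEMAP_WIDTH = 128   # tilemap is 128x128 entries (from the text above)
TILE_SIZE = 8         # assumption: characters are 8x8 pixels
BYTES_PER_TILE = TILE_SIZE * TILE_SIZE  # one byte per pixel


def mode7_pixel(vram, x, y):
    """Return the raw 8-bit pixel at untransformed map position (x, y).

    vram is a bytes-like dump of VRAM; word N is assumed to occupy byte
    offsets 2*N (low byte) and 2*N + 1 (high byte).
    """
    tile_x, fine_x = divmod(x, TILE_SIZE)
    tile_y, fine_y = divmod(y, TILE_SIZE)

    # Tilemap entry: low byte of the word at (row-major) map position.
    map_word = tile_y * TILEMAP_WIDTH + tile_x
    tile = vram[2 * map_word]

    # Character data: high byte of the word holding this pixel.
    char_word = tile * BYTES_PER_TILE + fine_y * TILE_SIZE + fine_x
    return vram[2 * char_word + 1]
```

Under these assumptions the tilemap (128*128 bytes) and the character data
(256*64 bytes) each fill exactly one byte lane of the first 16384 words, which
matches the statement that Mode 7 data takes the first half of VRAM.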
[...]
When bit 6 of $2133 is set, you get a related mode known as Mode 7 EXTBG. In
this mode, you get a BG2 with 128 colors, which uses the same tilemap and
character data as BG1 but interprets the high bit of the pixel as a priority
bit.
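
As a rough sketch of that interpretation, the same pixel byte can be split into
a priority flag and a 7-bit color index. The function name split_extbg_pixel is
hypothetical and chosen only for this example.

```python
def split_extbg_pixel(pixel):
    """Split a Mode 7 pixel byte the way BG2 reads it under EXTBG.

    Returns (priority, color): bit 7 is the priority bit, and the low
    7 bits select one of 128 colors.
    """
    priority = (pixel >> 7) & 1
    color = pixel & 0x7F
    return priority, color
```

BG1 would use the full byte as a 256-color index, so both layers can be
derived from one character fetch.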
[...]
Note that BG1, being a 256-color BG, can do Direct Color mode (in this case, of
course, the tilemap supplies no palette value, so you're limited to 256 colors
instead of 2048). BG2 does not do Direct Color mode, since it is only 7-bit.
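
To show where the 256-color limit comes from, here is a sketch of the Direct
Color expansion. It assumes the commonly documented layout in which the pixel
byte is BBGGGRRR and the result is a 15-bit BGR value; that layout is not
stated in this section, so treat it as an assumption. In other modes the three
tilemap palette bits would fill one extra low bit per channel, giving
2^(8+3) = 2048 colors. Here they are simply zero. The function name is
hypothetical.

```python
def mode7_direct_color(pixel):
    """Expand an 8-bit Mode 7 pixel to a 15-bit BGR color.

    Assumed layout: pixel = BBGGGRRR, result = 0bbbbbgggggrrrrr.
    The bits that a palette value would normally supply are left at 0.
    """
    red = (pixel & 0x07) << 2           # RRR -> top 3 bits of 5-bit red
    green = ((pixel >> 3) & 0x07) << 2  # GGG -> top 3 bits of 5-bit green
    blue = ((pixel >> 6) & 0x03) << 3   # BB  -> top 2 bits of 5-bit blue
    return red | (green << 5) | (blue << 10)
```

Because the low channel bits are never set, pure white is not reachable: the
brightest value under this layout is 0x639C rather than 0x7FFF.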