Simplify the load and save interfaces. Replace the (data, data_cursor) tuples with a load_ctx object. Add an option to skip decoding tag names to Unicode, which improves speed and memory use when tag names are assumed to be ASCII. Make nbt.py and _nbt.pyx more consistent with each other. Import from _nbt.pyx using a method similar to the one used in the standard library. Move the pretty-print code to its own file and share it between the two implementations.
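
A rough sketch of the context-object idea, for illustration only. The class name LoadContext, the decode_names flag, the read helpers, and load_name are hypothetical and may not match the names used in nbt.py or _nbt.pyx. Plain UTF-8 decoding is used here as a stand-in for whatever decoding the real loader performs.

```python
import struct


class LoadContext:
    """Hypothetical stand-in for load_ctx: the buffer plus the read position."""

    def __init__(self, data, decode_names=True):
        self.data = data                  # raw bytes being parsed
        self.cursor = 0                   # current read offset into data
        self.decode_names = decode_names  # False: leave tag names as bytes

    def read(self, fmt):
        """Unpack one value at the cursor and advance past it."""
        (value,) = struct.unpack_from(fmt, self.data, self.cursor)
        self.cursor += struct.calcsize(fmt)
        return value

    def read_bytes(self, length):
        """Return the next `length` bytes and advance past them."""
        chunk = self.data[self.cursor:self.cursor + length]
        if len(chunk) != length:
            raise ValueError("unexpected end of data")
        self.cursor += length
        return chunk


def load_name(ctx):
    """Read a length-prefixed name (big-endian unsigned short, then bytes)."""
    length = ctx.read(">H")
    raw = ctx.read_bytes(length)
    # Skipping the decode avoids creating a second string object per name.
    return raw.decode("utf-8") if ctx.decode_names else raw


ctx = LoadContext(b"\x00\x05hello\x00\x00\x00\x2a")
name = load_name(ctx)      # decoded tag name
value = ctx.read(">i")     # the same context carries the cursor forward
```

Because the cursor lives on the context, helpers no longer have to return an updated (data, data_cursor) pair, and options such as the decode flag travel with the same object instead of being threaded through every call.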