A Simple Python IO Trick


I thought using Python’s gzip is quite straight forward, however the IO performance almost doubled with an extra BufferedWriter:

import io
import gzip

def export_to_json_gz(schema):
  json_file = "test.json.gz"
  with io.BufferedWriter( gzip.open(temp_dir + json_file, 'wb') ) as gzfile:
    for row in stream_table_data(schema):
      ujson.dump(row, gzfile, ensure_ascii=False)
      gzfile.write('\n')

Now the bottle neck is ujson, how can I make that part faster?

🙂