Python batch text preprocessing
A simple script to remove redundant whitespace from a text file:
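import re

# fname and fname_new are the input and output file paths, defined elsewhere.
with open(fname, 'r') as input_f, open(fname_new, 'w') as output_f:
    for line in input_f:
        # Collapse each run of spaces into a single space.
        line_out = re.sub(' +', ' ', line)
        output_f.write(line_out)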
You can adapt this script to other line-by-line cleanups by changing the regular expression (and its replacement) passed to re.sub.
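As a rough sketch of that kind of customization, the version below keeps the same read-a-line, write-a-line loop but applies a list of substitutions in order. The names RULES, clean_line, and clean_file are made up for this illustration, and the three rules are only examples, not part of the original script.

```python
import re

# Hypothetical rule list: (compiled pattern, replacement) pairs, applied in order.
RULES = [
    (re.compile(r'\t'), ' '),   # turn each tab into a space
    (re.compile(r' +'), ' '),   # collapse each run of spaces into one space
    (re.compile(r' +$'), ''),   # drop spaces at the end of the line
]


def clean_line(line, rules=RULES):
    """Apply every rule to one line and return the cleaned line."""
    for pattern, replacement in rules:
        line = pattern.sub(replacement, line)
    return line


def clean_file(src, dst, rules=RULES):
    """Read src line by line and write the cleaned lines to dst."""
    with open(src, 'r') as input_f, open(dst, 'w') as output_f:
        for line in input_f:
            output_f.write(clean_line(line, rules))
```

Calling clean_file(fname, fname_new) would then behave like the script above, with the extra rules applied. Adding or removing a cleanup step means editing the list rather than the loop.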
Written on September 23, 2017