In Python, you can read from and write to files without importing any modules. Python has a built-in function, “open”, which returns a file object that you can read from and write to. Let us see two ways of opening a file for reading and writing:
# First way: open and close the files explicitly.
fp_in = open('/etc/hosts', 'r')  # default is 'r', we can omit it.
fp_out = open('/tmp/hosts', 'w')
for line in fp_in:
    fp_out.write(line)
fp_in.close()
fp_out.close()

# Second way: let 'with' manage the files.
with open('/etc/hosts') as fp_in:
    with open('/tmp/hosts', 'w') as fp_out:  # 'w' is required for writing
        for line in fp_in:
            fp_out.write(line)
# No need to close file, it is automatically closed at end of block.
The reason most commonly given for closing the file object in the first case is to free up resources. But there is a second reason, and it is why you should always use the ‘with’ keyword. After you write to a file object, and before you close it, the whole content of the source file might not yet appear in the destination file. This is because write() is buffered, and the changes are not guaranteed to reach the file on disk until you call flush() or close() on the file object. Here is the help page for ‘write’:
write(...)
    write(str) -> None.  Write string str to file.

    Note that due to buffering, flush() or close() may be needed before
    the file on disk reflects the data written.
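The effect described in the help text can be reproduced at small scale. The following is a minimal sketch, assuming Python 3 on CPython with its default buffering. The temporary file and the names `payload`, `size_before_flush`, and `size_after_flush` are illustrative and not part of the demonstration that follows.

```python
import os
import tempfile

# Create an empty temporary file to write to (illustrative path only).
fd, path = tempfile.mkstemp()
os.close(fd)

payload = "buffered data"  # short and newline-free, so it fits in the buffer

fp_out = open(path, "w", encoding="utf-8")
fp_out.write(payload)

# The data is still held in Python's buffer at this point, so the file
# on disk is expected to be empty.
size_before_flush = os.path.getsize(path)

fp_out.flush()  # hand the buffered data over to the operating system
size_after_flush = os.path.getsize(path)

fp_out.close()
os.remove(path)

print(size_before_flush, size_after_flush)
```

With a payload this small, the first number should be 0 and the second should equal the length of the payload. A payload larger than the buffer would be partly written before flush() is called, which is the situation the larger demonstration runs into.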
Let me demonstrate this by copying the /var/log/messages file to /tmp/messages; the bigger the file, the more likely you are to witness the effect of buffering. First I will copy /var/log/messages to /var/log/messages.orig and work with messages.orig, as the original will most likely change in size as we work along. The output below was taken after the copy loop had run but before fp_out was closed.
[root@kauai ~]# wc -l /var/log/messages.orig
10544 /var/log/messages.orig
[root@kauai ~]# wc -l /tmp/messages
10542 /tmp/messages
[root@kauai ~]# tail -1 /tmp/messages
Nov 16 02:36:02 kauai syslog-ng[1605]: Log statistics; processed='src.internal(s_sys[root@kauai ~]#
[root@kauai ~]# tail -1 /var/log/messages
Nov 16 02:46:02 kauai syslog-ng[1605]: Log statistics; processed='src.internal(s_sys#2)=1787', stamp='src.internal(s_sys#2)=1416123362', processed='source(s_name_servers)=0', processed='destination(d_mesg)=7693', processed='destination(d_auth)=210', processed='source(s_sys)=12643', processed='global(payload_reallocs)=3568', processed='destination(d_mail)=12', processed='destination(d_kern)=5176', processed='destination(d_mlal)=0', processed='destination(d_ns_filtered)=0', processed='global(msg_clones)=0', processed='destination(d_spol)=0', processed='destination(hosts)=12643', processed='destination(d_boot)=0', processed='global(sdata_updates)=0', processed='center(received)=0', processed='destination(d_cron)=3653', processed='center(queued)=0'
Notice how the destination file /tmp/messages is truncated: its last line is cut off mid-entry and does not even end with a newline character. Now close the output file and check again:
fp_out.close()
[root@kauai ~]# wc -l /tmp/messages
10544 /tmp/messages
[root@kauai ~]# tail -1 /var/log/messages
Nov 16 02:56:02 kauai syslog-ng[1605]: Log statistics; processed='src.internal(s_sys#2)=1788', stamp='src.internal(s_sys#2)=1416123962', processed='source(s_name_servers)=0', processed='destination(d_mesg)=7694', processed='destination(d_auth)=211', processed='source(s_sys)=12646', processed='global(payload_reallocs)=3570', processed='destination(d_mail)=12', processed='destination(d_kern)=5176', processed='destination(d_mlal)=0', processed='destination(d_ns_filtered)=0', processed='global(msg_clones)=0', processed='destination(d_spol)=0', processed='destination(hosts)=12646', processed='destination(d_boot)=0', processed='global(sdata_updates)=0', processed='center(received)=0', processed='destination(d_cron)=3654', processed='center(queued)=0'
This problem would not have happened had we used the ‘with’ keyword, as it automatically closes the file, flushing the buffer in the process, at the end of the block:
with open('/var/log/messages.orig') as fp_in:
    with open('/tmp/messages', 'w') as fp_out:
        for line in fp_in:
            fp_out.write(line)
[root@kauai ~]# wc -l /var/log/messages.orig
10544 /var/log/messages.orig
[root@kauai ~]# wc -l /tmp/messages
10544 /tmp/messages
There you go: the destination file matches the source as soon as the block ends.
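The two nested blocks can also be written as a single ‘with’ statement that manages both files. Here is a minimal sketch, assuming Python 3. The helper name `copy_lines` and the temporary files are hypothetical, used only for illustration in place of the real log files.

```python
import os
import shutil
import tempfile


def copy_lines(src_path, dst_path):
    """Copy src_path to dst_path line by line and return both file objects."""
    # One 'with' statement can manage several files. Both are closed, and
    # the output buffer flushed, when the block ends, even if an error occurs.
    with open(src_path) as fp_in, open(dst_path, "w") as fp_out:
        for line in fp_in:
            fp_out.write(line)
    return fp_in, fp_out


# Illustrative usage with throwaway files instead of real system logs.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "source.txt")
dst = os.path.join(workdir, "copy.txt")

with open(src, "w") as fp:
    fp.write("first line\nsecond line\n")

fp_in, fp_out = copy_lines(src, dst)

with open(dst) as fp:
    copied = fp.read()

shutil.rmtree(workdir)  # remove the throwaway directory

print(fp_in.closed, fp_out.closed)
print(copied == "first line\nsecond line\n")
```

The `closed` attribute of each returned file object shows that no explicit close() call was needed, and the destination already holds the full content by the time it is read back.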