Web sites store information on the local machines of site visitors using cookies. On subsequent visits, the browser sends the data from the cookies on the visitor's machine back to the web server, which can then use that information as a historical record of the user's activity on the site – at a minimum, the time the cookie was created, when it is set to expire, and the last access time or last visit to the site. Sites also use cookies to 'remember' user activity, such as shopping cart items or login/session information, to work around the shortcomings of the stateless HTTP protocol.
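To see what a cookie's metadata looks like on the wire, Python's standard library can parse a Set-Cookie style header. This is just an illustrative sketch – the header value below is made up, not taken from any real site.

```python
# Parse an example Set-Cookie header and read back its metadata.
# The header value here is a made-up example for illustration.
from http.cookies import SimpleCookie

header = 'session_id=abc123; Expires=Thu, 11 Feb 2016 17:56:01 GMT; Path=/; HttpOnly'
cookie = SimpleCookie()
cookie.load(header)

morsel = cookie['session_id']
print(morsel.value)        # abc123
print(morsel['path'])      # /
print(morsel['expires'])   # the expiry date string set by the server
```

The Expires attribute is exactly the kind of date field the script below pulls out of Firefox's cookie store.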
Most users think that only the sites they have directly visited store cookies on their computers; in reality, the number is much higher than that. A single page you visit usually contains many embedded links, especially ads, that store cookies on your computer. In this post, I will demonstrate how to list all the sites that left cookies on your computer, as well as how to extract additional information from those cookies. When I ran the script and counted the top 10 sites with the largest number of entries in the cookies sqlite DB, all but one or two were sites I had never directly visited!
This Python script was written to extract cookie information on a Linux box running Firefox. Firefox stores the cookie information in a sqlite file, so you will need the sqlite3 Python module to read it.
The script takes the path to the cookies file and the path to an output file; it writes the results to that file and also dumps them to the screen.
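Before parsing anything, it can help to confirm what columns the cookies database actually has. Here is a small sketch using only the sqlite3 module; `table_columns` is a hypothetical helper, not part of the script below.

```python
# List the column names of a table in a sqlite file.
# Useful for sanity-checking the moz_cookies schema before parsing.
import sqlite3

def table_columns(db_path, table='moz_cookies'):
    """Return the column names of `table` in the given sqlite file."""
    conn = sqlite3.connect(db_path)
    try:
        cur = conn.execute('PRAGMA table_info({0})'.format(table))
        return [row[1] for row in cur.fetchall()]  # row[1] holds the column name
    finally:
        conn.close()
```

Pointing it at your profile's cookies.sqlite should list the columns shown in the script's docstring, such as baseDomain, expiry, lastAccessed and creationTime.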
root@dnetbook:/home/daniel/python# python cookie_viewer.py
cookie_viewer.py cookie-fullpath output-file
root@dnetbook:/home/daniel/python# python /home/daniel/python/cookie_viewer.py $(find /home/daniel/ -type f -name 'cookies.sqlite' | head -1) /tmp/test.txt
doubleclick.net,Thu Feb 11 17:56:01 2016,Thu Apr 23 20:46:58 2015,Tue Feb 11 17:56:01 2014
twitter.com,Thu Feb 11 17:56:05 2016,Tue Apr 21 22:27:46 2015,Tue Feb 11 17:56:05 2014
imrworldwide.com,Thu Feb 11 17:56:12 2016,Tue Apr 21 22:19:35 2015,Tue Feb 11 17:56:12 2014
quantserve.com,Thu Aug 13 19:32:02 2015,Thu Apr 23 20:46:57 2015,Tue Feb 11 18:32:0
Each line of the output is the domain name of the site, followed by the cookie's expiry date, last access time and creation time.
Code follows –
#!/usr/bin/env python
'''
Given the path to a Firefox cookies sqlite file, write each cookie's
date fields - expiry, last accessed, creation time - to a file in plain text.

Columns of the moz_cookies table:
id baseDomain appId inBrowserElement name value host path expiry
lastAccessed creationTime isSecure isHttpOnly

Example run:
python /home/daniel/python/cookie_viewer.py $(find /home/daniel/ -type f -name 'cookies.sqlite' | head -1) /tmp/test.txt
'''
import sys
import os
from datetime import datetime
import sqlite3

def Usage():
    print("{0} cookie-fullpath output-file".format(sys.argv[0]))
    sys.exit(1)

if len(sys.argv) < 3:
    Usage()

sqldb = sys.argv[1]
destfile = sys.argv[2]

# Some dates in the cookies file might not be valid, or too big
MAXDATE = 2049840000

# The cookies file must exist; most often the file name is cookies.sqlite
if not os.path.isfile(sqldb):
    Usage()

def convert(epoch):
    '''A hack - convert epoch times (in seconds or microseconds) to
    human-readable form, keeping the first 10 digits as the seconds part.'''
    mydate = epoch[:10]
    if int(mydate) > MAXDATE:
        mydate = str(MAXDATE)
    if len(epoch) > 10:
        mytime = epoch[10:]
    else:
        mytime = '0'
    fulldate = float(mydate + '.' + mytime)
    return datetime.fromtimestamp(fulldate).ctime()

# Bind to the sqlite db and execute the sql statement
conn = sqlite3.connect(sqldb)
cur = conn.cursor()
try:
    data = cur.execute('select * from moz_cookies')
except sqlite3.Error as e:
    print('Error {0}:'.format(e.args[0]))
    sys.exit(1)
mydata = data.fetchall()

# Dump results to a file
with open(destfile, 'w') as fp:
    for item in mydata:
        urlname = item[1]
        expiry = convert(str(item[8]))
        accessed = convert(str(item[9]))
        created = convert(str(item[10]))
        fp.write(urlname + ',' + expiry + ',' + accessed + ',' + created + '\n')

# Dump to stdout as well
with open(destfile) as fp:
    for line in fp:
        print(line, end='')
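The digit-slicing in convert() works because Firefox stores expiry in seconds but lastAccessed and creationTime in microseconds since the epoch. A more explicit version of the same idea is sketched below; `epoch_to_ctime` is a hypothetical helper, not part of the script above.

```python
# An explicit alternative to the convert() hack: decide seconds vs
# microseconds by magnitude instead of slicing the digit string.
from datetime import datetime

def epoch_to_ctime(value, max_ts=2049840000):
    """Convert a cookie timestamp (seconds or microseconds) to ctime text."""
    # Anything above ~1e12 cannot be a plausible seconds value, so treat
    # it as microseconds (Firefox's lastAccessed/creationTime fields).
    seconds = value / 1e6 if value > 1e12 else float(value)
    seconds = min(seconds, max_ts)  # clamp implausibly distant expiry dates
    return datetime.fromtimestamp(seconds).ctime()
```

Either approach yields the same ctime-style strings shown in the sample output above.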
Top 10 sites with the highest number of entries in the cookies file –
root@dnetbook:/home/daniel/python# awk -F, '{print $1}' /tmp/test.txt | sort | uniq -c | sort -nr | head -10
     73 taboola.com
     59 techrepublic.com
     43 insightexpressai.com
     34 pubmatic.com
     33 2o7.net
     31 rubiconproject.com
     28 demdex.net
     27 chango.com
     26 yahoo.com
     26 optimizely.com
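The same ranking can be done in pure Python with collections.Counter instead of the awk/sort/uniq pipeline. This is a sketch; `top_domains` is a hypothetical helper, and /tmp/test.txt is the example output path from the run above.

```python
# Pure-Python equivalent of: awk -F, '{print $1}' FILE | sort | uniq -c | sort -nr | head -N
# Counts how many cookie entries each domain left in the script's output file.
from collections import Counter

def top_domains(path, n=10):
    """Return the n most common first-column (domain) values in a CSV file."""
    with open(path) as fp:
        counts = Counter(line.split(',')[0] for line in fp if line.strip())
    return counts.most_common(n)

# e.g. top_domains('/tmp/test.txt') returns (domain, count) pairs,
# most frequent first.
```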
View all posts in this blog – https://linuxfreelancer.com/all-posts