Archive for the ‘Linux’ Category

The date command on Linux boxes is one of the most powerful open source utilities. It is not just for setting the clock on your PC or server, or showing you the current time; it can do amazingly more and can answer virtually any chronological question you throw at it.

The simplest use case of the date command is viewing the current time, possibly in different formats –

$ date
Sat Dec 17 00:45:35 EST 2016

$ date '+%Y-%m-%d'
2016-12-17

$ date '+%c'
Sat 17 Dec 2016 12:45:51 AM EST

It is also useful for converting time to and from the Unix epoch –

$ date '+%s'
1481953669

$ date --date='@1481953669'
Sat Dec 17 00:47:49 EST 2016
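
The two directions can be combined for a quick round trip (a sketch) –

$ date -d "@$(date '+%s')"     # prints the current time, reconstructed from its epoch value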

The most user friendly feature of the date command is the ‘-d’ or ‘--date’ option, which accepts a free-format, human-readable date string such as “yesterday”, “last week”, “next year”, “3 min ago”, “last friday + 2 hours”, etc. Here is an excerpt from the man page of the GNU date command –

DATE STRING
The --date=STRING is a mostly free format human readable date string such as "Sun, 29 Feb 2004 16:21:42 -0800" or "2004-02-29 16:21:42" or even "next Thursday". A date string may contain items indicating calendar date, time of day, time zone, day of week, relative time, relative date, and numbers. An empty string indicates the beginning of the day. The date string format is more complex than is easily documented here but is fully described in the info documentation.

Let us play with it –

$ date -d '2 hours ago'
Fri Dec 16 22:51:25 EST 2016

$ date -d '2 hours ago' '+%c'
Fri 16 Dec 2016 10:51:30 PM EST

$ env TZ=America/Los_Angeles date -d '2 hours ago' '+%c'
Fri 16 Dec 2016 07:52:33 PM PST

$ date -d 'jan 2 1990'
Tue Jan  2 00:00:00 EST 1990

$ date -d 'yesterday'
Fri Dec 16 00:53:04 EST 2016

$ date -d 'next year + 2 weeks'
Sun Dec 31 00:53:27 EST 2017

For a practical example, let us use the date command to find out on which day of the week all of someone's birthdays fall, given their date of birth. This works for past birthdays as well as future ones; for this example we will go from the date of birth up to today. Let us pick someone who was born on Feb 29, 1988. This is an edge case, and the date command should be smart enough to figure out the leap years.

# Print only the years in which Feb 29 is a valid date, i.e. the leap years
for year in {1988..2016}; do
  if date -d "feb 29 $year" &>/dev/null; then
    echo -n "Year: $year   "; date -d "feb 29 $year" '+%c'
  fi
done

Year: 1988   Mon 29 Feb 1988 12:00:00 AM EST
Year: 1992   Sat 29 Feb 1992 12:00:00 AM EST
Year: 1996   Thu 29 Feb 1996 12:00:00 AM EST
Year: 2000   Tue 29 Feb 2000 12:00:00 AM EST
Year: 2004   Sun 29 Feb 2004 12:00:00 AM EST
Year: 2008   Fri 29 Feb 2008 12:00:00 AM EST
Year: 2012   Wed 29 Feb 2012 12:00:00 AM EST
Year: 2016   Mon 29 Feb 2016 12:00:00 AM EST

A more typical case would be someone born on, say, Jan 8, 1990 –

age=0
for year in {1990..2016}; do 
  echo -n "Age: $age  "; date -d "Jan 8 $year" '+%A %d %B %Y'
  age=$((age+1))
done

Age: 0  Monday 08 January 1990
Age: 1  Tuesday 08 January 1991
Age: 2  Wednesday 08 January 1992
Age: 3  Friday 08 January 1993
Age: 4  Saturday 08 January 1994
Age: 5  Sunday 08 January 1995
Age: 6  Monday 08 January 1996
Age: 7  Wednesday 08 January 1997
Age: 8  Thursday 08 January 1998
Age: 9  Friday 08 January 1999
Age: 10  Saturday 08 January 2000
Age: 11  Monday 08 January 2001
Age: 12  Tuesday 08 January 2002
Age: 13  Wednesday 08 January 2003
Age: 14  Thursday 08 January 2004
Age: 15  Saturday 08 January 2005
Age: 16  Sunday 08 January 2006
Age: 17  Monday 08 January 2007
Age: 18  Tuesday 08 January 2008
Age: 19  Thursday 08 January 2009
Age: 20  Friday 08 January 2010
Age: 21  Saturday 08 January 2011
Age: 22  Sunday 08 January 2012
Age: 23  Tuesday 08 January 2013
Age: 24  Wednesday 08 January 2014
Age: 25  Thursday 08 January 2015
Age: 26  Friday 08 January 2016

How do you find out the number of CPU cores available in your Linux system? Here are a number of ways; pick the one that works for you –

1. nproc command –

[daniel@kauai tmp]$ nproc
2

2. /proc/cpuinfo – grep for the processor entries; a one-liner that just counts them is sketched after the output below.

[daniel@kauai tmp]$ grep proc /proc/cpuinfo 
processor	: 0
processor	: 1
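
If all you want is the count, counting the processor entries works too (a quick sketch) –

$ grep -c ^processor /proc/cpuinfo
2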

3. top – run the top command and press ‘1’ (the number one); you will see the list of cores at the top, right below the tasks summary.

Cpu0 : 0.7%us, 0.3%sy, 0.0%ni, 99.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 2.7%us, 1.0%sy, 0.0%ni, 96.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st

4. lscpu – displays information about the CPU architecture. Multiply Socket(s) by Core(s) per socket, in this case 1 x 2 = 2 (a one-liner doing the multiplication is sketched after the output below) –

[daniel@kauai tmp]$ lscpu 
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                2
On-line CPU(s) list:   0,1
Thread(s) per core:    1
Core(s) per socket:    2
Socket(s):             1
NUMA node(s):          1
Vendor ID:             AuthenticAMD
CPU family:            16
Model:                 6
Model name:            AMD Athlon(tm) II X2 250 Processor
Stepping:              3
CPU MHz:               3000.000
BogoMIPS:              6027.19
Virtualization:        AMD-V
L1d cache:             64K
L1i cache:             64K
L2 cache:              1024K
NUMA node0 CPU(s):     0,1
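
If you prefer not to multiply by hand, a rough one-liner along these lines should work (a sketch; note that on hyper-threaded CPUs the physical core count differs from the logical CPU count that nproc and the CPU(s) line report) –

$ lscpu | awk -F: '/^Socket/ {s=$2} /per socket/ {c=$2} END {print s*c}'
2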

5. Kernel threads – pick one of the kernel housekeeping threads, such as “migration” or “watchdog”, and see how many cores it is running on –

[daniel@kauai tmp]$ ps aux |grep '[m]igration'
root         3  0.0  0.0      0     0 ?        S    Dec09   0:02 [migration/0]
root         7  0.0  0.0      0     0 ?        S    Dec09   0:02 [migration/1]

[daniel@kauai tmp]$ ps aux |grep '[w]atchdog'
root         6  0.0  0.0      0     0 ?        S    Dec09   0:00 [watchdog/0]
root        10  0.0  0.0      0     0 ?        S    Dec09   0:00 [watchdog/1]

The shell has environment variables that determine its behavior, and exporting environment variables is also a popular way of changing an application's behavior. These environment variables can be loaded, or ‘sourced’, using the source builtin command or the ‘.’ notation. In this post, I will share a particular problem I ran into while sourcing environment variables saved in a .envrc file.

Problem – sourcing .envrc was not loading the right environment variables, whether with ‘source .envrc’ or ‘. .envrc’. Renaming .envrc to any other file name works though.

[daniel@kauai tmp]$ cat .envrc 
NAME='Jhon Doe'
[daniel@kauai tmp]$ source .envrc 
[daniel@kauai tmp]$ echo $NAME
Alice Bob

As you can see, the variable NAME is set to ‘Jhon Doe’ in the file, and yet after sourcing .envrc, NAME shows ‘Alice Bob’! Renaming the file seems to resolve the issue –

[daniel@kauai tmp]$ source .envrcs 
[daniel@kauai tmp]$ echo $NAME
Jhon Doe

Troubleshooting using strace – I followed the tips in ‘is-it-possible-to-strace-the-builtin-commands-to-bash’ to strace ‘source’. Stracing shell builtins is not straightforward. After looking at the output, I found out that the source builtin was actually reading .envrc from a different directory, not my current working directory! The directory it was sourcing from was one of the directories in the $PATH environment variable.
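
For reference, the stracing itself can be reproduced with something along these lines (a sketch; since source is a shell builtin, you trace the bash process and watch which .envrc path gets opened) –

$ strace -e trace=open,openat bash -c 'source .envrc' 2>&1 | grep envrc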

Read the man pages – Looking at the man page for bash, under the section for the source builtin –

source filename [arguments]
Read and execute commands from filename in the current shell environment and return the exit status of the last command executed from filename. If filename does not contain a slash, file names in PATH are used to find the directory containing filename. The file searched for in PATH need not be executable. When bash is not in posix mode, the current directory is searched if no file is found in PATH. If the sourcepath option to the shopt builtin command is turned off, the PATH is not searched. If any arguments are supplied, they become the positional parameters when filename is executed. Otherwise the positional parameters are unchanged. The return status is the status of the last command exited within the script (0 if no commands are executed), and false if filename is not found or cannot be read.
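
In other words, if a file named .envrc happens to exist in one of the directories listed in $PATH, source will pick it up from there before ever looking in the current directory. A quick way to check for such stray files (a sketch) –

$ for dir in ${PATH//:/ }; do [ -f "$dir/.envrc" ] && echo "$dir/.envrc"; done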

Apparently this is expected behavior. If I hadn't had a .envrc file in one of the $PATH directories, this would have been fine. In this case, there are several solutions –

1. Remove .envrc from $PATH directories [not the best option ]
2. Rename .envrc to a different file [ not ideal either ]
3. When sourcing the file, use absolute path [ good practice ]

[daniel@kauai tmp]$ echo $NAME
Alice Bob
[daniel@kauai tmp]$ pwd
/tmp
[daniel@kauai tmp]$ source /tmp/.envrc 
[daniel@kauai tmp]$ echo $NAME
Jhon Doe

4. When sourcing the file, add a slash (‘/’), for instance – source ./.envrc [ good practice ]
5. Disable the sourcepath shell option [ not a bad idea ]

[daniel@kauai tmp]$ shopt sourcepath
sourcepath     	on
[daniel@kauai tmp]$ shopt -u sourcepath
[daniel@kauai tmp]$ source .envrc 
[daniel@kauai tmp]$ echo $NAME
Jhon Doe

Bottom line – Make sure you understand how sourcing or loading environment variables works in bash and follow good practices, so that you won't waste precious hours trying to figure out why your script is behaving strangely.

Sooner or later, you will find yourself adding sensitive data into Ansible playbooks, host or group vars files. Such information might include MySQL DB credentials, AWS secret keys, API credentials, etc. Including such sensitive information in plain text might not be acceptable for security compliance reasons, or it could even lead to your systems being owned when your company hires a third party to do pen testing, or worse yet, by outside hackers. In addition to this, sharing such playbooks to public repositories such as github won't be easy, as you have to manually search for and redact all the sensitive information from all your playbooks, and as we know, a manual procedure is always error prone. You might ‘forget’ to remove some of the passwords.

One solution for this is a password vault to hold all your sensitive data, and Ansible provides a utility called ansible-vault to create such an encrypted file; the data can then be decrypted when running your playbooks with a single option. This is the equivalent of Chef's encrypted data bag.

In this blog post, I will share with you how to use a secret key file to protect sensitive data in Ansible with the ansible-vault utility. The simplest use case is to protect the encrypted file with a password or passphrase, but that is not convenient, as you have to type the password every time you run a playbook, and it is not as strong as a key file with hundreds or thousands of random characters. The steps below therefore describe only the procedure for setting up a secret key file rather than a password-protected encrypted file. Let us get started.

The first step is to generate a key file containing a random list of characters –

#openssl rand -base64 512 |xargs > /opt/ansible/vaultkey

Create or initialize the vault with the key file generated above –

#ansible-vault create --vault-password-file=/opt/ansible/vaultkey /opt/ansible/lamp/group_vars/dbservers.yml

Populate your vault, refer to Ansible documentation on the format of the vault file –

#ansible-vault edit --vault-password-file=/opt/ansible/vaultkey /opt/ansible/lamp/group_vars/dbservers.yml
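
The vault content itself is just an ordinary YAML variables file. As an illustration, it could look something like this (the variable names below are made up, not from the original post) –

---
mysql_root_password: 'S0meSecretPassw0rd'
aws_secret_access_key: 'abcd1234efgh5678'

These variables can then be referenced in playbooks and templates the usual way, for example {{ mysql_root_password }}.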

You can view the contents by replacing ‘edit’ with ‘view’ –

#ansible-vault view --vault-password-file=/opt/ansible/vaultkey /opt/ansible/lamp/group_vars/dbservers.yml

That is it – you now have a secret key file protecting an encrypted YAML file that holds all the sensitive variables to be used in your Ansible playbooks.

There comes a time, though, when you have to change the secret key file, say when an admin leaves the company after winning the Mega jackpot lottery 🙂 In that case we have to generate a new key file and rekey the encrypted file as soon as possible –

Generate a new key file –

#openssl rand -base64 512 |xargs > /opt/ansible/vaultkey.new

Rekey to new key file –

#ansible-vault rekey --new-vault-password-file=/opt/ansible/vaultkey.new --vault-password-file=/opt/ansible/vaultkey /opt/ansible/lamp/group_vars/dbservers.yml
Rekey successful

Verify –

#ansible-vault view --vault-password-file=/opt/ansible/vaultkey.new /opt/ansible/lamp/group_vars/dbservers.yml

Last but not least, make sure the secret key file is well protected and is readable only by the owner.

#chmod 600 /opt/ansible/vaultkey.new

Finally, you can use the vault with ansible-playbook. In this case, I am running it against site.yml, which is a master playbook that sets up a LAMP cluster in AWS (pulling the AWS instances using the ec2.py dynamic inventory script) –

#ansible-playbook -i /usr/local/bin/ec2.py site.yml --vault-password-file /opt/ansible/vaultkey.new
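
As a side note, if typing --vault-password-file on every run becomes tedious, the key file location can also be set once in ansible.cfg (or via the ANSIBLE_VAULT_PASSWORD_FILE environment variable); a minimal sketch –

[defaults]
vault_password_file = /opt/ansible/vaultkey.new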

Web sites store information on the local machines of site visitors using cookies. On subsequent visits, the browser sends the data from the cookies on the visitor's machine back to the web server, which might then use that information as a historical record of the user's activity on the site – at a minimum the time the cookie was created, when it is set to expire, and the last time the user visited the site. Sites also use cookies to ‘remember’ user activity, say the shopping cart items or login/session information, to work around the shortcomings of the stateless HTTP protocol.

Most users think that only the sites they have directly visited store cookies on their computers; in reality, the number is way higher than that. A single site you visit usually has lots of links in it, especially ads, that store cookies on your computer. In this post, I will demonstrate how to list all the sites that left cookies on your computer, as well as how to extract additional information from those cookies. When I ran the script and counted the top 10 sites with the largest number of entries in the cookies sqlite DB, all but one or two of them were sites I had never directly visited!

This Python script was written to extract cookie information on a Linux box running Firefox. The cookie data is stored in a sqlite file, so you will need the sqlite3 Python module to read it.

The script takes the path to the cookies file as well as the path to an output file; it writes the results to that file and also dumps them to the screen.

root@dnetbook:/home/daniel/python# python cookie_viewer.py 
cookie_viewer.py cookie-fullpath output-file

root@dnetbook:/home/daniel/python# python /home/daniel/python/cookie_viewer.py $(find /home/daniel/ -type f -name 'cookies.sqlite' | head -1) /tmp/test.txt
doubleclick.net,Thu Feb 11 17:56:01 2016,Thu Apr 23 20:46:58 2015,Tue Feb 11 17:56:01 2014
twitter.com,Thu Feb 11 17:56:05 2016,Tue Apr 21 22:27:46 2015,Tue Feb 11 17:56:05 2014
imrworldwide.com,Thu Feb 11 17:56:12 2016,Tue Apr 21 22:19:35 2015,Tue Feb 11 17:56:12 2014
quantserve.com,Thu Aug 13 19:32:02 2015,Thu Apr 23 20:46:57 2015,Tue Feb 11 18:32:0

The output is the domain name of the site, the cookie expiry date, the last access time and the creation time, one cookie per line.

Code follows –

#!/usr/bin/env python

''' Given a location to firefox cookie sqlite file
    Write its date param - expiry, last accessed,
    Creation time to a file in plain text.
    id
    baseDomain
    appId
    inBrowserElement
    name
    value
    host
    path
    expiry
    lastAccessed
    creationTime
    isSecure
    isHttpOnly
    python /home/daniel/python/cookie_viewer.py $(find /home/daniel/ -type f -name 'cookies.sqlite' | head -1) /tmp/test.txt 
'''

import sys
import os
from datetime import datetime
import sqlite3

def Usage():
    print "{0} cookie-fullpath output-file".format(sys.argv[0])
    sys.exit(1)

if len(sys.argv)<3:
    Usage()

sqldb=sys.argv[1]
destfile=sys.argv[2]
# Some dates in the cookies file might not be valid, or too big
MAXDATE=2049840000

# cookies file must be there, most often file name is cookies.sqlite
if not os.path.isfile(sqldb):
    Usage()

# a hack - to convert the epoch times to human readable format
def convert(epoch):
    mydate=epoch[:10]
    if int(mydate)>MAXDATE:
        mydate=str(MAXDATE)
    if len(epoch)>10:
        mytime=epoch[11:]
    else:
        mytime='0'
    fulldate=float(mydate+'.'+mytime)
    x=datetime.fromtimestamp(fulldate)
    return x.ctime()

# Bind to the sqlite db and execute sql statements
conn=sqlite3.connect(sqldb)
cur=conn.cursor()
try:
    data=cur.execute('select * from moz_cookies')
except sqlite3.Error, e:
    print 'Error {0}:'.format(e.args[0])
    sys.exit(1)
mydata=data.fetchall()

# Dump results to a file
with open(destfile, 'w') as fp:
    for item in mydata:
        urlname=item[1]
        expiry=convert(str(item[8]))
        accessed=convert(str(item[9]))
        created=convert(str(item[10]))
        fp.writelines(urlname + ',' + expiry + ',' + accessed + ',' + created)
        fp.writelines('\n')

# Dump to stdout as well
with open(destfile) as fp:
    for line in fp:
        print line

Top 10 sites with the highest number of entries in the cookies file –

root@dnetbook:/home/daniel/python# awk -F, '{print $1}' /tmp/test.txt  | sort | uniq -c | sort -nr | head -10
     73 taboola.com
     59 techrepublic.com
     43 insightexpressai.com
     34 pubmatic.com
     33 2o7.net
     31 rubiconproject.com
     28 demdex.net
     27 chango.com
     26 yahoo.com
     26 optimizely.com


In Python, you can read from and write to files without importing any modules. Python has the built-in function “open”, which can be used to view and manipulate file objects. Let us look at two ways of opening a file for reading or writing –

   fp_in = open('/etc/hosts', 'r')  # default is 'r', we can omit it.
   fp_out = open('/tmp/hosts', 'w')
   for line in fp_in:
       fp_out.write(line)

   fp_in.close()
   fp_out.close()


   with open('/etc/hosts') as fp_in:
       with open('/tmp/hosts', 'w') as fp_out:
           for line in fp_in:
               fp_out.write(line)
   # No need to close the files, they are automatically closed at the end of the block.

One of the most common reasons given for why you have to close the file object in the first case is to free up resources. But there is a second reason why you should always use the ‘with’ keyword: after writing to a file object, and before closing it, the whole content from the source file might not yet appear in the destination file. This is because write uses buffering, and the changes will not be reflected until you run flush() or close() on the file object. Here is the help page for ‘write’ –

write(...)
    write(str) -> None.  Write string str to file.
    
    Note that due to buffering, flush() or close() may be needed before
    the file on disk reflects the data written.
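
To make this concrete, here is a tiny sketch (the file name is made up for illustration); flush() pushes the buffered data out to disk while the file object stays open –

>>> fp = open('/tmp/demo.txt', 'w')
>>> fp.write('hello world\n')
>>> fp.flush()    # the data is on disk now, even though fp has not been closed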

Let me demonstrate this by copying the /var/log/messages file to /tmp/messages; the bigger the file, the more likely you are to witness the effect of buffering. First I will take a copy of /var/log/messages to /var/log/messages.orig and work with messages.orig, as the former will most likely change in size as we work along.
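
The copy was done from an interactive Python session. The exact commands are not shown here, but it was along these lines (a sketch), with the file object deliberately left open –

>>> fp_in = open('/var/log/messages.orig')
>>> fp_out = open('/tmp/messages', 'w')
>>> for line in fp_in:
...     fp_out.write(line)
...
>>> # note: no flush() or close() on fp_out yet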

[root@kauai ~]# wc -l /var/log/messages.orig 
10544 /var/log/messages.orig

[root@kauai ~]# wc -l /tmp/messages 
10542 /tmp/messages
[root@kauai ~]# tail -1 /tmp/messages 
Nov 16 02:36:02 kauai syslog-ng[1605]: Log statistics; processed='src.internal(s_sys[root@kauai ~]# 

[root@kauai ~]# tail -1 /var/log/messages
Nov 16 02:46:02 kauai syslog-ng[1605]: Log statistics; processed='src.internal(s_sys#2)=1787', stamp='src.internal(s_sys#2)=1416123362', processed='source(s_name_servers)=0', processed='destination(d_mesg)=7693', processed='destination(d_auth)=210', processed='source(s_sys)=12643', processed='global(payload_reallocs)=3568', processed='destination(d_mail)=12', processed='destination(d_kern)=5176', processed='destination(d_mlal)=0', processed='destination(d_ns_filtered)=0', processed='global(msg_clones)=0', processed='destination(d_spol)=0', processed='destination(hosts)=12643', processed='destination(d_boot)=0', processed='global(sdata_updates)=0', processed='center(received)=0', processed='destination(d_cron)=3653', processed='center(queued)=0'

Notice how the destination file /tmp/messages got truncated: it is two lines short and doesn't even have a newline character at the end.

Once we call close() on the file object, the remaining buffered data is flushed to disk –

fp_out.close()
[root@kauai ~]# wc -l /tmp/messages 
10544 /tmp/messages

[root@kauai ~]# tail -1 /var/log/messages
Nov 16 02:56:02 kauai syslog-ng[1605]: Log statistics; processed='src.internal(s_sys#2)=1788', stamp='src.internal(s_sys#2)=1416123962', processed='source(s_name_servers)=0', processed='destination(d_mesg)=7694', processed='destination(d_auth)=211', processed='source(s_sys)=12646', processed='global(payload_reallocs)=3570', processed='destination(d_mail)=12', processed='destination(d_kern)=5176', processed='destination(d_mlal)=0', processed='destination(d_ns_filtered)=0', processed='global(msg_clones)=0', processed='destination(d_spol)=0', processed='destination(hosts)=12646', processed='destination(d_boot)=0', processed='global(sdata_updates)=0', processed='center(received)=0', processed='destination(d_cron)=3654', processed='center(queued)=0'

This problem would not have happened if we had used the ‘with’ keyword, as it automatically does the flush() and close() for us at the end of the block statement –

    with open('/var/log/messages.orig') as fp_in:
        with open('/tmp/messages', 'w') as fp_out:
            for line in fp_in:
                fp_out.write(line)

[root@kauai ~]# wc -l /var/log/messages.orig 
10544 /var/log/messages.orig
[root@kauai ~]# wc -l /tmp/messages 
10544 /tmp/messages

There you go, both source and destination files synced immediately.