Archive for the ‘ Linux ’ Category

Getting the URLs in your favorites or bookmarks as a plain list.

I have tons of pages that I bookmarked in my Firefox browser on a Linux box and wanted to get a simple listing of these URLs with their titles.

1. Export the bookmarks to a JSON file
2. Parse the JSON file to get a simple list

1. How to export bookmarks in Firefox as JSON.
Go to Bookmarks menu
Show All Bookmarks
Import and Backup (click the down arrow to expand it)
Backup
Save (Make sure JSON is selected at the right bottom corner)

The file will be saved with a name like ‘bookmarks-2013-12-07.json’; the format is ‘bookmarks-yyyy-mm-dd.json’. Note the path where you saved this file, since we will need it for the next step.

2. Get a simple list out of the JSON format file

We are going to use Python's json module to load the file into a Python dictionary and print the entries that contain URLs. Pass the path where you saved the bookmarks file as the first argument to the script.


#!/usr/bin/env python
'''Extract a list of URLs from a Firefox exported bookmarks JSON file.'''

import sys
import os
import json
import io


def usage():
    print("Usage: {0} path-to-bookmarks-file".format(sys.argv[0]))
    sys.exit(1)


def load_bookmarks(bookmark_file):
    '''Load the exported JSON file and return it as a Python dict.'''
    # Does the file exist?
    if not os.path.isfile(bookmark_file):
        print("{0} not found.".format(bookmark_file))
        sys.exit(1)
    # The with block closes the file even if parsing fails
    with io.open(bookmark_file, encoding='utf-8') as fp_data:
        try:
            return json.load(fp_data)
        except ValueError:
            print("{0} is not a valid JSON file".format(bookmark_file))
            sys.exit(1)


def grab_keys(bookmarks_data, bookmarks_list=None):
    '''Recursively collect the title and uri of every item in the tree.'''
    # Create the list per call; a mutable default would be shared across calls
    if bookmarks_list is None:
        bookmarks_list = []
    for item in bookmarks_data.get('children', []):
        bookmarks_list.append({'title': item.get('title', 'No title'),
                               'uri': item.get('uri', 'None')})
        grab_keys(item, bookmarks_list)
    return bookmarks_list


def main():
    if len(sys.argv) < 2:
        usage()
    for item in grab_keys(load_bookmarks(sys.argv[1])):
        myurl = item['uri']
        # Keep real links only; folders and separators have no http/ftp uri
        if myurl.startswith(('http', 'ftp')):
            print("{0}    {1}".format(myurl, item['title']))


if __name__ == "__main__":
    main()

Save this file as, say, ‘get_bookmarks.py’. Running it will give output similar to the following:

[root@localhost]# python get_bookmarks.py bookmarks-2013-12-07.json
https://www.google.com/ Google
https://aws.amazon.com/ Amazon Web Services, Cloud Computing: Compute, Storage, Database
http://docs.python.org/3/py-modindex.html Python Module Index - Python v3.3.3 documentation
http://www.linuxhomenetworking.com/wiki/#.UqMjHddn21E Linux Home Networking
http://www.zytrax.com/books/dns/ DNS for Rocket Scientists - Contents
http://www.centos.org/ Centos
http://wiki.centos.org/ Wiki
http://www.centos.org/docs/6/ Documentation
http://www.centos.org/modules/newbb/ Forums

Another way of approaching the problem is to export the bookmarks as an HTML file and then dump it as a text file. Here I used ‘lynx’ (install it using ‘yum install lynx’ on CentOS/RHEL/Fedora) to dump the file and grepped for the URLs:

[root@localhost]# lynx -dump bookmarks.html | egrep '[0-9]+\.[[:space:]]+http'
3. https://www.google.com/
4. https://aws.amazon.com/
5. http://docs.python.org/3/py-modindex.html
6. http://www.linuxhomenetworking.com/wiki/#.UqMjHddn21E
7. http://www.zytrax.com/books/dns/
9. http://www.centos.org/
10. http://wiki.centos.org/
11. http://www.centos.org/docs/6/
12. http://www.centos.org/modules/newbb/

[root@localhost]# lynx -dump bookmarks.html | egrep '[0-9]+\.[[:space:]]+http' | awk '{print $2}'
https://www.google.com/
https://aws.amazon.com/
http://docs.python.org/3/py-modindex.html
http://www.linuxhomenetworking.com/wiki/#.UqMjHddn21E
http://www.zytrax.com/books/dns/
http://www.centos.org/
http://wiki.centos.org/
http://www.centos.org/docs/6/
http://www.centos.org/modules/newbb/
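
The same extraction can be sketched in Python without lynx. This is a minimal sketch that assumes each bookmark in the exported HTML file is an ordinary anchor tag with an href attribute; the class name, function name and sample markup are illustrative only.

```python
from html.parser import HTMLParser


class BookmarkLinkParser(HTMLParser):
    """Collect (url, title) pairs from <a href="..."> tags."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (url, title) pairs
        self._href = None    # href of the <a> tag currently open
        self._text = []      # text pieces seen inside that tag

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us
        if tag == "a":
            href = dict(attrs).get("href")
            if href and href.startswith(("http", "ftp")):
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html_text):
    """Return a list of (url, title) tuples found in html_text."""
    parser = BookmarkLinkParser()
    parser.feed(html_text)
    parser.close()
    return parser.links


# Hypothetical sample, shaped like a small exported bookmarks file
sample = """
<DL><p>
<DT><A HREF="https://www.google.com/">Google</A>
<DT><A HREF="place:sort=8">Most Visited</A>
<DT><A HREF="http://www.centos.org/">Centos</A>
</DL>
"""
for url, title in extract_links(sample):
    print(url, title)
```

Unlike the lynx pipeline, this keeps the title next to each URL and skips non-http entries such as the internal "place:" links.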

In order to use this script, you need to do certain things in advance –

1. Download youtube-dl, a script which allows you to download videos

https://github.com/rg3/youtube-dl

2. Install ffmpeg: an audio/video conversion tool.
Ubuntu users can run the following commands –

  apt-get install ffmpeg libavcodec-extra-53

Note: More details can be found here.

Usage Example: –

 ./musicdownloader.sh "http://www.youtube.com/watch?v=8tHu-OwzwPg" BereketMengstead-mizerey.mp3

#!/bin/bash
# Download a YouTube video with youtube-dl and convert it to MP3 with ffmpeg.

downloader=$(command -v youtube-dl)
ffmpeg=$(command -v ffmpeg)
bitrate=192000

ARGC=$#
LINK=$1
FILENAME=$2
# Temporary file that holds the downloaded video
SAVEDFILE=$(basename "$0")_mymusic123.mp4

if [ "$ARGC" -ne 2 ]; then
  echo "Usage: $(basename "$0") url-link output-file"
  echo "Example: $(basename "$0") http://www.youtube.com/watch?v=fQZNiMckKbI Azmari-ethio01.mp3"
  exit 1
fi

# Variables are quoted so that '?' and '&' in the URL are not expanded by the shell
"$downloader" -f 18 "$LINK" -o "$SAVEDFILE" && "$ffmpeg" -i "$SAVEDFILE" -f mp3 -ab "$bitrate" -vn "$FILENAME"

if [ $? -eq 0 ]; then
  echo "File saved in $FILENAME"
  rm "$SAVEDFILE"
fi

Recently I was trying to download numerous files from a certain website using a shell script I wrote. Within the script, I first used wget to retrieve the files, but I kept getting the following error message:

HTTP request sent, awaiting response... 403 Forbidden
2012-12-30 06:17:45 ERROR 403: Forbidden.

Then, hoping that this was just a wget problem, I replaced wget with curl. It turned out that curl would create a file with the same name as the one being downloaded, but to my surprise it did not contain the requested file. Instead, it contained an HTML page with a 403 Forbidden message.

403 Forbidden
Forbidden

You don't have permission to access /dir/names.txt on this server.

What was surprising is that I could download the files using Firefox, Internet Explorer, elinks and even the text-based browser ‘lynx’. It seems that the website was blocking clients that send certain ‘User-Agent’ header values. So the trick was simply to change the User-Agent to a ‘legitimate’ one. Both curl and wget support altering the User-Agent header field. You can use the commands below to change it:

# wget and curl expect only the header value, without the "User-Agent:" field name
USER_AGENT="Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12"

wget --user-agent="$USER_AGENT" -c http://linuxfreelancer.com/status.html

curl -A "$USER_AGENT" -O http://linuxfreelancer.com/status.html

In addition to wget or curl, the CLI HTTP client httpie, which is much easier to use, can do the same. Passing custom HTTP headers is intuitive with httpie; see the httpie documentation for installation and usage details. Modifying the User-Agent header using httpie is shown below:

USER_AGENT="User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12"
http http://linuxfreelancer.com/ "$USER_AGENT"

All commands –

USER_AGENT="Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12"
wget --user-agent="$USER_AGENT" -c http://linuxfreelancer.com/status.html

curl -A "$USER_AGENT" -O http://linuxfreelancer.com/status.html

http http://linuxfreelancer.com/ "User-Agent:$USER_AGENT"
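
If the download is done from a Python script instead of the shell, the same idea can be sketched with the standard library. This is a minimal sketch, not part of the original script; the function names are made up for illustration, and urllib takes the header name and the header value separately, so the string holds the value only.

```python
import urllib.request

# The header value only, without the "User-Agent:" field name
USER_AGENT = ("Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.12) "
              "Gecko/20101026 Firefox/3.6.12")


def build_request(url, user_agent=USER_AGENT):
    """Return a Request object that carries a custom User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def download(url, filename, user_agent=USER_AGENT):
    """Fetch url with the custom User-Agent and save the body to filename."""
    with urllib.request.urlopen(build_request(url, user_agent)) as response:
        with open(filename, "wb") as out:
            out.write(response.read())


# Building the request does not touch the network; download() does.
req = build_request("http://linuxfreelancer.com/status.html")
print(req.get_header("User-agent"))
```

Calling download("http://linuxfreelancer.com/status.html", "status.html") would then play the role of the wget and curl commands above.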

View all posts in this blog – https://linuxfreelancer.com/all-posts

Amazon provides extensive tools to manage virtual machines hosted on Amazon Web Services (AWS). It is very easy to launch VMs, and even easier to destroy or terminate them! In the latter case it might be unintentional: with just one mis-click and a confirmation you could end up terminating a critical production server. There is no way of bringing back a terminated VM in AWS. Once it is gone, it is gone forever.

So what steps should you follow to prevent unintended data loss?

1. Make sure the virtual machines are properly labelled in the EC2 dashboard, under “Name”. This can be done by simply right clicking a VM and selecting “Add/Edit Tags”. If you have many servers without proper tags, you might unintentionally terminate the wrong server.

2. Enable Termination Protection: Right click on the VM and select “Termination Protection”. Make sure termination protection is enabled. If you later decide to terminate the VM, you first have to disable termination protection through this option, and then go back to the dashboard to terminate it.
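
Termination protection can also be switched on from a script rather than the dashboard. The sketch below assumes the third-party boto3 library and configured AWS credentials, neither of which the steps above require; the function name and the instance ID in the usage comment are placeholders.

```python
def enable_termination_protection(ec2_client, instance_id):
    """Turn on termination protection for a single EC2 instance.

    ec2_client is expected to be a boto3 EC2 client,
    for example the object returned by boto3.client("ec2").
    """
    return ec2_client.modify_instance_attribute(
        InstanceId=instance_id,
        DisableApiTermination={"Value": True},
    )


# Usage (needs boto3 and AWS credentials; the instance ID is a placeholder):
#   import boto3
#   ec2 = boto3.client("ec2")
#   enable_termination_protection(ec2, "i-0123456789abcdef0")
```

Passing the client in as an argument keeps the function easy to try out with a stand-in object before pointing it at a real account.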

Extract MP3 from Youtube

Download audio in mp3 format from Youtube

Got your favorite youtube video and yet you don’t have it in an audio format such as mp3 to play it offline? With open source tools, you can grab that video and convert it to mp3 at no cost.

Prepare a directory for downloading mp4 format files from youtube.

mkdir /home/youtube

Tools you need

1. youtube-dl: A python script to download videos from youtube – http://rg3.github.io/youtube-dl/download.html

Using curl –

sudo curl -L https://yt-dl.org/downloads/latest/youtube-dl -o /usr/local/bin/youtube-dl
sudo chmod a+rx /usr/local/bin/youtube-dl

Using wget –

sudo wget https://yt-dl.org/downloads/latest/youtube-dl -O /usr/local/bin/youtube-dl
sudo chmod a+rx /usr/local/bin/youtube-dl

Using pip –

sudo pip install --upgrade youtube_dl

2. ffmpeg: an audio/video conversion tool.

 apt-get install ffmpeg libavcodec-extra-53

Procedure
1. Make the youtube link ready for download and then download the mp4 using the youtube-dl script
2. Use ffmpeg tools to convert mp4 to mp3

Sample download

Let us download the following youtube link

youtube-dl -f 18 -t http://www.youtube.com/watch?v=dAG2qxvYwsY

options: -f is for file format of the youtube video (check youtube-dl documentation for the whole list)

Next, convert it to mp3

ffmpeg -i Freselam_Mussie_s_Tsinih_Zeytibli-dAG2qxvYwsY.mp4 -f mp3 -ab 192000 -vn Freselam_Mussie_s_Tsinih_Zeytibli-dAG2qxvYwsY.mp3

options: -i for input
         -f for output file format
         -ab for audio bit rate
         -vn to disable video recording
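
To run the conversion from Python, the same ffmpeg options can be passed as an argument list. This is a minimal sketch assuming ffmpeg is installed and on the PATH; the function names and the file names in the last line are illustrative.

```python
import subprocess


def build_ffmpeg_command(video_file, mp3_file, bitrate=192000):
    """Return the ffmpeg invocation shown above as an argument list."""
    return [
        "ffmpeg",
        "-i", video_file,      # input file
        "-f", "mp3",           # output file format
        "-ab", str(bitrate),   # audio bit rate in bits per second
        "-vn",                 # disable video recording
        mp3_file,
    ]


def convert_to_mp3(video_file, mp3_file, bitrate=192000):
    """Run ffmpeg; raises subprocess.CalledProcessError if it fails."""
    subprocess.run(build_ffmpeg_command(video_file, mp3_file, bitrate), check=True)


# Only prints the command; call convert_to_mp3() to actually run ffmpeg.
print(" ".join(build_ffmpeg_command("video.mp4", "audio.mp3")))
```

Passing a list instead of one shell string means file names with spaces or special characters need no extra quoting.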

References –

http://rg3.github.com/youtube-dl/

http://rg3.github.io/youtube-dl/download.html

Besides a website, the server running this blog also hosts an Internet radio station. Do you see the “Listen Music” link in the top right corner of the home page (http://danasmera.com:8000/listen.pls?sid=1)? It is running on an AWS EC2 micro instance, which does not cost much. So how do you turn your public-facing server into an Internet radio station, accessible from your PC, laptop or mobile phone? It is quite simple; two of the most popular solutions are Icecast and SHOUTcast. Here is how you can set up an Internet radio broadcast using SHOUTcast.

1. Add shoutcast user

#useradd shoutcast or
#adduser shoutcast

cd /home/shoutcast

2. Download shoutcast

Go to http://www.shoutcast.com/broadcast-tools and download SHOUTcast Distributed Network Audio Server(DNAS).

#wget -c http://download.nullsoft.com/shoutcast/tools/sc_serv2_linux_x64_07_31_2011.tar.gz  

(for 64-bit linux machine)

#wget -c http://download.nullsoft.com/shoutcast/tools/sc_serv2_linux_07_31_2011.tar.gz

(for 32-bit linux machine)

If you plan to broadcast mp3 format, you will need the SHOUTcast Transcoder (SC_TRANS)

#wget -c http://download.nullsoft.com/shoutcast/tools/sc_trans_linux_x64_10_07_2011.tar.gz  

(for 64-bit linux machine)

#wget -c http://download.nullsoft.com/shoutcast/tools/sc_trans_linux_10_07_2011.tar.gz 

(for 32-bit linux machine)

3. uncompress and untar the shoutcast programs (In my case, it is the 64-bit version)

#tar xzvf sc_serv2_linux_x64_07_31_2011.tar.gz
#tar xzvf sc_trans_linux_x64_10_07_2011.tar.gz

4. Time to edit two important config files: sc_serv_basic.conf and sc_trans_basic.conf

a. sc_serv_basic.conf

logfile=logs/sc_serv.log
w3clog=logs/sc_w3c.log
banfile=control/sc_serv.ban
ripfile=control/sc_serv.rip
publicserver=always
password=yourpasswordhere #this password is used by sc_trans, make sure to use same password in sc_trans_basic.conf
adminpassword=yourpasswordhereagain #this password is used to access the admin page through your browser
streamid=1
streampath=/test.aac

streamauthhash_1=AcMnKLMrYVmK2NlR9W8j #unique for each station, needed if you plan to make your station publicly available.

b. sc_trans_basic.conf

logfile=logs/sc_trans.log
calendarrewrite=0
encoder_1=aacp ## uploaded mp3 music files will be played as AAC
bitrate_1=56000
outprotocol_1=3
serverip_1=127.0.0.1 ##listen only on loopback interface
serverport_1=8000
uvoxauth_1=yourpasswordhere ## This password has to be the same as 'password' in sc_serv_basic.conf
uvoxstreamid_1=1
endpointname_1=/Bob
streamtitle=Eritrean and Ethiopian Guayla
streamurl=http://danasmera.com:8000/listen.pls?sid=1
genre=Tigrigna Guayla
playlistfile=playlists/main.lst ## the file containing the path to individual music files, we will populate this later.
adminport=7999
adminuser=administrator
adminpassword=youradminpasshere
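
Since a mismatch between ‘password’ in sc_serv_basic.conf and ‘uvoxauth_1’ in sc_trans_basic.conf is an easy mistake to make, a quick check can be sketched in Python. This is only a sketch: the helper names are made up, and it treats everything after a # as an annotation, as in the listings above, which is an assumption about these listings and not a description of how SHOUTcast parses its own files.

```python
def parse_conf(text):
    """Parse key=value lines, ignoring blank lines and trailing # notes."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop the annotation, if any
        if "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
    return settings


def passwords_match(serv_text, trans_text):
    """True when sc_trans is configured with the password sc_serv expects."""
    password = parse_conf(serv_text).get("password")
    return password is not None and password == parse_conf(trans_text).get("uvoxauth_1")


# Shortened stand-ins for the two config files
serv_conf = "password=yourpasswordhere\nadminpassword=yourpasswordhereagain\n"
trans_conf = "uvoxauth_1=yourpasswordhere ## must match sc_serv\n"
print(passwords_match(serv_conf, trans_conf))
```

To check the real files, read each one with open(...).read() and pass the two strings to passwords_match(). A password that itself contains a # would be cut short by this simple parser.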

5. Upload your music files

Upload all your music files to the /home/shoutcast/music directory. Use any sftp client, such as winscp or filezilla for this task. Sample output –

root@danasmera:/home/shoutcast# ls -al /home/shoutcast/music/
-rw-r--r--  1 shoutcast shoutcast  6418432 2011-09-07 02:13 abrahamAF.mp3
-rw-r--r--  1 shoutcast shoutcast  7345261 2011-11-22 23:41 Abreham-vol2.mp3
-rw-r--r--  1 shoutcast shoutcast  6222993 2011-11-22 23:41 asmera.mp3
-rw-r--r--  1 shoutcast shoutcast  3197056 2011-09-13 02:56 Bebizelenayo.mp3
-rw-r--r--  1 shoutcast shoutcast  5890765 2011-11-22 23:41 Bereket1.mp3

6. Populate your playlist file, i.e. /home/shoutcast/playlists/main.lst (the ‘playlistfile’ set in sc_trans_basic.conf), with the full path of every music file you have on the server.

a. All music files are in a specific directory, e.g. /home/shoutcast/music, assuming mp3 file format.

#find /home/shoutcast/music/ -type f -name "*.mp3" -exec ls -1  {} \; > /home/shoutcast/playlists/main.lst

b. Music files are located in different directories on the server, assuming mp3 file format.

#find / -type f -name "*.mp3" -exec ls -1  {} \; > /home/shoutcast/playlists/main.lst
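
The playlist can also be generated from Python, which is handy if you want to rebuild it from a script after every upload. This is a minimal sketch; the function name is illustrative, and unlike the find commands above it matches the extension case-insensitively and sorts the result.

```python
import os


def build_playlist(music_dir, playlist_path, extension=".mp3"):
    """Write the full path of every matching file under music_dir, one per line."""
    paths = []
    for root, _dirs, files in os.walk(music_dir):
        for name in files:
            if name.lower().endswith(extension):
                paths.append(os.path.join(os.path.abspath(root), name))
    paths.sort()
    with open(playlist_path, "w", encoding="utf-8") as playlist:
        for path in paths:
            playlist.write(path + "\n")
    return paths


# Example call, using the music directory from this post and the
# playlist name from sc_trans_basic.conf:
#   build_playlist("/home/shoutcast/music", "/home/shoutcast/playlists/main.lst")
```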

7. File permissions and firewall

a. File permissions
Make sure all files under /home/shoutcast are owned by the shoutcast user, otherwise shoutcast will encounter permission denied errors when it tries to play the files.

#chown -R shoutcast:shoutcast /home/shoutcast

b. Open ports 8000 and 8001

#iptables -A INPUT -p tcp -i eth0 --dport 8000 -m state --state NEW -j ACCEPT

(-i: might be different depending on your NIC interface such as eth1, eth2 …)

#iptables -A INPUT -p tcp -i eth0 --dport 8001 -m state --state NEW -j ACCEPT

In case of Amazon ec2 servers, you need to open up port 8000 for the specific security group under which the server is running. It is accessible in AWS web management console.

8. Run shoutcast services

#cd /home/shoutcast
#./sc_serv sc_serv_basic.conf > /dev/null 2>&1 &
#./sc_trans sc_trans_basic.conf > /dev/null 2>&1 &

Test if shoutcast is listening on the specified ports using netstat

root@danasmera:/home/shoutcast# netstat -talpn |grep sc_
tcp        0      0 0.0.0.0:8000            0.0.0.0:*               LISTEN      1075/sc_serv    
tcp        0      0 0.0.0.0:8001            0.0.0.0:*               LISTEN      1075/sc_serv  
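
The same check can be done from Python, for example from a monitoring script or from another machine to confirm that the firewall lets the connection through. This is a minimal sketch; the function name is made up, and the host should be replaced with your server's address when testing remotely.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Check the two ports used in this post on the local machine
for port in (8000, 8001):
    state = "listening" if port_is_open("127.0.0.1", port) else "not reachable"
    print("port {0}: {1}".format(port, state))
```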

9. Register your shoutcast radio with yp.shoutcast.com to make it a publicly available station.

Follow the instructions on this wiki on how to do this – http://wiki.winamp.com/wiki/SHOUTcast_Authhash_Management
In short – Go to your admin page eg. http://yourip-or-domain:8000/admin.cgi
Click the “Create Authhash” link, and after filling out the form, make sure the appropriate entry is added to the streamauthhash_1 parameter in your sc_serv_basic.conf file.

10. Enjoy the music!

One way to listen to the music is by directly browsing to the link, as in http://danasmera.com:8000/listen.pls?sid=1 or http://yourip-or-hostname:8000/listen.pls?sid=1 in its generic form. But the most convenient way is to use a mobile phone app to search for your station in the shoutcast yellow pages, and add it to your favorites list. On Android phones, download the “A Online Radio” app from the Market, open it and search for a keyword. In my case it could be “tigrigna” or “guayla”, the keywords I added when registering my station with the yellow pages. The station pops up in the search results; just click to play it. On an iPhone, you can use the ‘shoutcast’ app.

Finally, keep an eye on the log files in /home/shoutcast/logs. The information you will find there includes the music files played, your listeners' IP addresses and so on. For instance, you might run the following command from that directory to count connections per listener IP address:

#less sc_serv*  | grep -i client | awk '{print  $5}' | awk -F: '{print $1}' | sort | uniq -c | sort -nr
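
The counting step of that pipeline can be sketched in Python as follows. It mirrors the shell logic (lines mentioning "client", fifth whitespace-separated field, part before the colon). The sample lines are hypothetical and only shaped so that the fifth field holds ip:port; they are not copied from a real SHOUTcast log, and the function name is illustrative.

```python
from collections import Counter


def count_listener_ips(lines):
    """Count listener IPs the same way as the grep/awk/sort pipeline."""
    counts = Counter()
    for line in lines:
        if "client" not in line.lower():
            continue
        fields = line.split()
        if len(fields) >= 5:
            # Field 5 is expected to look like ip:port; keep the part before ':'
            counts[fields[4].split(":")[0]] += 1
    return counts.most_common()   # most frequent IP first


# Hypothetical log lines (format assumed for illustration only)
sample_log = [
    "2012-01-01 10:00:00 INFO [client 203.0.113.5:51000] connected",
    "2012-01-01 10:05:00 INFO [client 198.51.100.7:40222] connected",
    "2012-01-01 10:09:00 INFO [client 203.0.113.5:51044] connected",
    "2012-01-01 10:10:00 INFO playlist reloaded",
]
for ip, hits in count_listener_ips(sample_log):
    print(hits, ip)
```

To run it against a real log, pass an open file object instead of the sample list, since iterating over a file yields its lines.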

Last but not least, know the copyright laws in your country before you start broadcasting other people’s work!
