Posts

Linux CSV Command Line Tool XSV

In this post, I will introduce XSV, an excellent command line tool for parsing and manipulating CSV files. You can install the tool from GitHub: https://github.com/BurntSushi/xsv

Let us begin the tutorial. If you are on CentOS, the following installation commands will work.

    yum install cargo
    cargo install xsv

By default, when run as root, cargo installs the binary at /root/.cargo/bin/xsv. If you are not root, you might have to either set an alias or add the tool to your bash PATH.

    alias xsv=/root/.cargo/bin/xsv

As an example, I will look at the stock data for Apple, which I downloaded from Yahoo Finance. Let us first look at our data.

Read CSV file using xsv

Now let us try the xsv command "xsv table", which shows the data in a nice tabular form. Instead of the Linux 'head -2' command, we can also use the xsv slice command. The xsv slice command takes an index number and displays that particular row. Let us say we want to print the 2nd row. Note that the index starts from 0, which means data at
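The zero-based row indexing mentioned for xsv slice can be illustrated with a small Python sketch. This is only an analogy built on Python's standard csv module, not xsv itself; the `slice_row` helper and the sample rows are hypothetical and made up for illustration (they are not the Yahoo Finance data from the post).

```python
import csv
import io

# Hypothetical sample rows, invented for illustration only.
SAMPLE = """Date,Close
day-1,10.0
day-2,11.0
day-3,12.0
"""


def slice_row(csv_text, index):
    """Return the data row at a zero-based index; the header is not counted."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    return rows[index]


# Index 1 selects the 2nd data row because counting starts at 0.
second_row = slice_row(SAMPLE, 1)
print(second_row["Date"], second_row["Close"])
```

With zero-based counting, index 0 is the first data row, so asking for the 2nd row means asking for index 1.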

How To Install R and R Studio Server On CentOS

R is used extensively for data processing and analysis. It has gained a lot of popularity over the last few years because of the explosion of data from mobile and web applications. To leverage the power of R and its ecosystem, one needs to have the complete R suite of tools installed. Although there is a large community of R developers and system administrators, I couldn't find a good resource that covers everything about installing R and its tools in simple, easy steps. That's why I decided to write this post.

In this post, I will talk about installing the following: R, R Studio Server, and R Studio Connect.

Install R

Please run the following two commands to install R.

    sudo yum install epel-release
    sudo yum install R

Type R --version in your bash shell. You should see output showing the version of R you have installed. To bring up the R REPL, just type R and your R shell should start. To install any pa

How To Crawl Coupon Sites With Python

In this post, I will show you how to use Python and lxml to crawl coupons and deals from coupon sites. The purpose of this post is to help users write crawlers with Python. To demo this, I will crawl coupons from couponannie.com and couponmonk.us.

Example 1

Let us start with couponannie.com. First, import the following two libraries.

    import requests
    import lxml.html

Most coupon sites have thousands of coupon pages, usually one per company or site. These pages are built from structured templates, so a crawler written for one coupon page should work for all of them. This is the case for couponannie as well. Let us pick the following URL, couponannie.com/stores/linkfool, and extract the coupons and their related information.

    url = 'https://www.couponannie.com/stores/linkfool'

We will use requests to get the content of the above page as shown below.

    obj = requests.get(url)

Let us convert the data in to
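The fetch-and-parse pattern described above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the post's actual crawler: the helper names `parse_links` and `fetch_links`, the generic `//a[@href]` XPath, and the sample HTML are all hypothetical, since the real page structure of the coupon sites is not shown here. The demonstration runs on an inline HTML string so that it does not depend on the network.

```python
import requests
import lxml.html


def parse_links(html_text):
    """Parse HTML text and return (text, href) pairs for every link with an href."""
    tree = lxml.html.fromstring(html_text)
    links = []
    # Generic selector chosen for illustration; a real crawler would target
    # the site's own coupon markup instead.
    for anchor in tree.xpath('//a[@href]'):
        links.append((anchor.text_content().strip(), anchor.get('href')))
    return links


def fetch_links(url):
    """Download a page with requests and hand its HTML to parse_links."""
    obj = requests.get(url, timeout=10)
    obj.raise_for_status()  # fail early on HTTP errors
    return parse_links(obj.text)


# Offline demonstration on made-up HTML (not taken from any real site).
SAMPLE_HTML = (
    '<html><body>'
    '<a href="/deal/1">10% off</a>'
    '<a href="/deal/2"> Free shipping </a>'
    '</body></html>'
)
sample_links = parse_links(SAMPLE_HTML)
print(sample_links)
```

Keeping the parsing step separate from the download step makes it possible to try out selectors on saved HTML before pointing the crawler at a live page.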