Web automation using Perl: October 2014

Friday, 17 October 2014

Test 20: Find the name of a movie's director from its Wikipedia page

Test:

Given the Wikipedia page for any movie, find the name of its director.

Learning:

Use Mojo::DOM to read the text of any tag.
Use the /s modifier so that . also matches newlines: /Directed by(.*?\/a>)/ fails when the link is on a later line than the label, but /Directed by(.*?\/a>)/s works.
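The effect of the /s modifier can be checked offline. This is a minimal sketch that uses only core Perl; the HTML fragment is a made-up stand-in, not markup copied from Wikipedia.

```perl
use strict;
use warnings;

# Hypothetical fragment: the link sits on the line after the "Directed by" label.
my $html = qq{<th>Directed by</th>\n<td><a href="/wiki/Example">Example Director</a></td>};

# Without /s, . stops at the newline, so the pattern cannot reach "/a>".
my ($without_s) = $html =~ /Directed by(.*?\/a>)/;

# With /s, . also matches the newline and the match spans both lines.
my ($with_s) = $html =~ /Directed by(.*?\/a>)/s;

print defined $without_s ? "without /s: matched\n" : "without /s: no match\n";
print defined $with_s    ? "with /s: matched\n"    : "with /s: no match\n";
```

Running it prints "without /s: no match" followed by "with /s: matched".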

Solution:

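use strict;
use warnings;
use WWW::Mechanize;
use Mojo::DOM;

my $m = WWW::Mechanize->new();
$m->get("http://en.wikipedia.org/wiki/Kal_Ho_Naa_Ho");

my $c = $m->content();

# /s lets . match newlines, so the capture can span lines up to the first closing </a>
my ($fragment) = $c =~ /Directed by(.*?\/a>)/s;
die "Director not found\n" unless defined $fragment;

# Parse the captured fragment and print the text of the first link in it
my $dom = Mojo::DOM->new($fragment);
print $dom->at('a')->text, "\n";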
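As an alternative to cutting the page with a regex first, Mojo::DOM can locate the cell on its own. The sketch below assumes a simplified infobox layout (a th label followed by a td that holds the link); the fragment is hypothetical and real Wikipedia markup may differ.

```perl
use strict;
use warnings;
use Mojo::DOM;

# Hypothetical infobox fragment, used so the example runs without a network connection.
my $html = <<'HTML';
<table class="infobox">
  <tr><th>Directed by</th><td><a href="/wiki/Example_Director">Example Director</a></td></tr>
  <tr><th>Produced by</th><td><a href="/wiki/Example_Producer">Example Producer</a></td></tr>
</table>
HTML

my $dom = Mojo::DOM->new($html);

# Find the header cell labelled "Directed by".
my $th = $dom->find('th')->first(sub { $_->text =~ /Directed by/ });

# Read the first link in the cell that follows the label, if both exist.
my $director;
if ($th && $th->next) {
    my $link = $th->next->at('a');
    $director = $link->text if $link;
}

print defined $director ? "$director\n" : "Director not found\n";
```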


Wednesday, 15 October 2014

Test 19: Enter a keyword (product name) in Google and find the company that offers the product at the lowest price in the Google results

Test:

Take a keyword (product name) from STDIN and find the prices and company names shown in the Google results. Print the minimum price and the company that offers it. If no item is found, print "no shoppable item".

Learning:

Get the page content and extract the prices and company names with a regex.

Solution:

use strict;
use warnings;
use WWW::Mechanize;

print "Enter the keyword\n";
chomp(my $key = <STDIN>);

my $m = WWW::Mechanize->new();
$m->get("http://www.google.com");

$m->submit_form(
    form_number => 1,
    fields => { q => $key },
);

$m->click_button(name => "btnG");
my $c = $m->content();

if ($c !~ /Shop for .*? on Google/)
{
    print "no shoppable item\n";
    exit;
}

# Each match contributes two elements: the price text, then the company name
my @links = $c =~ /(Rs\.\s.*?)<.*?wrap\">(.*?)<\/cite>/g;

my ($min, $min_company);
for (my $i = 0; $i < @links; $i += 2)
{
    my ($price, $company) = @links[$i, $i + 1];
    $price =~ s/Rs\..*?(\d)/Rs.$1/;
    print "$price $company\n";

    # Reduce the price to a plain number: drop the prefix and every comma
    $price =~ s/Rs\.//;
    $price =~ s/,//g;

    # Compare numerically; matching $min as a regex could hit a larger price that contains it
    if (!defined $min || $price < $min)
    {
        ($min, $min_company) = ($price, $company);
    }
}

print "\nLowest price is $min by $min_company\n" if defined $min;
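The minimum-price step can be tried without a network connection. In this sketch the price and seller pairs are invented sample data, laid out in the flat (price, company, price, company, ...) order that a two-capture /g match returns. It compares prices numerically, because looking up the minimum with a regex match such as /499.00/ would also match 1499.00.

```perl
use strict;
use warnings;

# Hypothetical sample data: price text followed by seller name, repeated.
my @links = (
    'Rs. 1,499.00',  'shop-a.example',
    'Rs. 499.00',    'shop-b.example',
    'Rs. 12,499.00', 'shop-c.example',
);

my ($min, $seller);
for (my $i = 0; $i < @links; $i += 2) {
    my ($price, $company) = @links[$i, $i + 1];

    # Strip the currency prefix and all thousands separators.
    $price =~ s/^Rs\.\s*//;
    $price =~ s/,//g;

    # Numeric comparison keeps 499.00 distinct from 1499.00.
    if (!defined $min || $price < $min) {
        ($min, $seller) = ($price, $company);
    }
}

print "Lowest price is Rs.$min by $seller\n";
```

With this sample data the script reports shop-b.example at Rs.499.00.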


Tuesday, 14 October 2014

Test 18: Enter a keyword in Google and find the top 5 results

Test:

Take a keyword from STDIN and print the top 5 Google results for that keyword.

Learning:

Get the page content and use a regex. Google wraps the matched keywords in <b> tags.

Solution:


Wednesday, 8 October 2014

Test 17: Print ALT text and URL of all images in a page

Test:

Get all images of a page and print ALT text and URL

Learning:

$m->images() returns all the images found on the page.

Solution:

use strict;
use warnings;
use WWW::Mechanize;

my $m = WWW::Mechanize->new();
$m->get("http://vienna.yapceurope.org/ye2007/search");
my @arr = $m->images();
my $count = 1;
foreach my $img (@arr)
{
    print "Image $count \n";
    # alt is undef when the tag has no ALT attribute
    my $alt = defined $img->alt ? $img->alt : '';
    print "ALT is $alt\n";
    print "URL is ", $img->url, "\n\n";
    $count++;
}
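The same calls can be exercised offline by pointing Mech at a local file. This is a sketch under stated assumptions: the page content, file names and ALT text are invented, and the temporary file only exists so that no network access is needed. It also shows find_all_images, which filters images by URL.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use URI::file;
use WWW::Mechanize;

# Write a small hypothetical page to a temporary .html file.
my ($fh, $path) = tempfile(SUFFIX => '.html', UNLINK => 1);
print {$fh} <<'HTML';
<html><head><title>Image demo</title></head><body>
<img src="logo.png" alt="Site logo">
<img src="spacer.gif">
</body></html>
HTML
close $fh;

my $m = WWW::Mechanize->new();
$m->get(URI::file->new_abs($path)->as_string);

# List every image; the second one has no ALT attribute, so alt returns undef.
my @images = $m->images();
for my $img (@images) {
    my $alt = defined $img->alt ? $img->alt : '(no ALT text)';
    printf "ALT is %s, URL is %s\n", $alt, $img->url;
}

# Keep only the images whose URL ends in .png.
my @png = $m->find_all_images(url_regex => qr/\.png$/);
print "PNG images: ", scalar(@png), "\n";
```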


Test 16: Show various methods that can be called directly on the Mech object

Test:

Use the methods that can be called directly as $m->method.

Learning:

Methods such as uri, status, content_type, base, is_html and title can be called directly on $m.

Solution:

use strict;
use warnings;

use WWW::Mechanize;

my $m = WWW::Mechanize->new();
$m->get("http://www.rediff.com");

print "URL is- ".$m->uri()."\n";

print "Status code is- ".$m->status()."\n";

print "Content type is- ".$m->content_type()."\n";

print "Base URI is- ".$m->base()."\n";

print "is content HTML- ".$m->is_html()."\n";

print "title is- ".$m->title()."\n";


Test 15: Set up a new user agent and verify that the request is sent via that user agent

Test:

Set up a new user agent and verify that the request is sent via that user agent.

Learning:

$mech->agent_alias( 'Windows IE 6' ); sets the User-Agent string from a predefined alias.

Possible aliases are:

Windows IE 6
Windows Mozilla
Mac Safari
Mac Mozilla
Linux Mozilla
Linux Konqueror

Solution:

use WWW::Mechanize;

my $mech = WWW::Mechanize->new();
$mech->agent_alias( 'Windows IE 6' );
$mech->get('http://whatsmyuseragent.com/');
my $data = $mech->content;

(my $ua)= $data =~ /<h2 class="info">(.*)<\/h2>/;

print $ua;
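The alias can also be checked locally, without asking a remote site to echo it back. This is a small sketch: agent returns the User-Agent string that Mech will send, and known_agent_aliases lists the aliases the installed WWW::Mechanize version knows about (the exact list may differ between versions).

```perl
use strict;
use warnings;
use WWW::Mechanize;

my $mech = WWW::Mechanize->new();
print "Default agent: ", $mech->agent, "\n";

# Switch to a predefined alias and read back the resulting User-Agent string.
$mech->agent_alias('Windows IE 6');
my $agent = $mech->agent;
print "After alias: $agent\n";

# Show which aliases this installation accepts.
print "Known aliases:\n";
print "  $_\n" for $mech->known_agent_aliases();
```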


Test 14: Print values of form fields, update them and print new values

Test:

Get values of all fields of a form, update the values and then print new values

Learning:

$form->value($name) returns the current value of a form field.

Solution:

use strict;
use warnings;

use WWW::Mechanize;

my $m = WWW::Mechanize->new();
$m->get("http://vienna.yapceurope.org/ye2007/search");

my $count = 1;
for my $form ($m->forms)
{
    print "form $count fields are:\n";
    $count++;
    for ($form->param)
    {
        printf "%s - %s\n", $_, $form->value($_);
    }
    print "\n";
}

$m->submit_form(
    form_number => 1,
    fields => { name => 'honey', town => 'Delhi', country => 'india', pm_group => 'Cam.pm' },
);

$count = 1;
print "After filling data to form\n";
for my $form ($m->forms)
{
    print "form $count fields are:\n";
    $count++;
    for ($form->param)
    {
        printf "%s - %s\n", $_, $form->value($_);
    }
    print "\n";
}
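Reading and updating field values can also be tried on an in-memory form with HTML::Form, the class behind the objects that $m->forms returns. The form markup and the base URL below are hypothetical; calling value with two arguments sets a field, and with one argument it reads the field.

```perl
use strict;
use warnings;
use HTML::Form;

# Hypothetical form; the field names echo the ones used in the solution above.
my $html = <<'HTML';
<form action="/search" method="get">
  <input type="text" name="name" value="">
  <input type="text" name="town" value="Vienna">
  <input type="submit" name="go" value="Search">
</form>
HTML

# In scalar context parse returns the first form in the document.
my $form = HTML::Form->parse($html, 'http://www.example.com/');

# Print every field name with its current value.
sub show_fields {
    my ($label, $f) = @_;
    print "$label\n";
    for my $name ($f->param) {
        my $value = $f->value($name);
        printf "%s - %s\n", $name, defined $value ? $value : '';
    }
    print "\n";
}

show_fields('Before update:', $form);

# Two-argument value() sets the field.
$form->value(name => 'honey');
$form->value(town => 'Delhi');

show_fields('After update:', $form);
```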


Tuesday, 7 October 2014

Test 13: Check if a request is redirected to another URL

Test:

Request a URL and check whether that same URL was opened or the request was redirected to some other URL.

Learning:

$response->request->uri returns the URI of the request that produced the final response, that is, the URI reached after any redirects. The response object comes from LWP::UserAgent.

Solution:

use strict;
use warnings;
use LWP::UserAgent;

my $url = 'http://www.google.com';
my $ua = LWP::UserAgent->new;
my $response = $ua->get($url);

# The request attached to the final response carries the URI that was actually fetched
my $final = $response->request->uri;
if ($final eq $url)
{
    print "no redirection\n";
}
else
{
    print "redirected to $final\n";
}
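Comparing URI strings is sensitive to small differences such as a trailing slash. HTTP::Response also offers redirects, which returns the earlier responses in the redirect chain. The sketch below builds a response chain by hand so that it runs offline; with LWP::UserAgent the chain is filled in automatically. The helper name redirect_count and the example.com URLs are made up for illustration.

```perl
use strict;
use warnings;
use HTTP::Request;
use HTTP::Response;

# Hypothetical helper: number of redirects that led to $response (0 means none).
sub redirect_count {
    my ($response) = @_;
    my @hops = $response->redirects;
    return scalar @hops;
}

# First hop: a 301 that points to the www host.
my $first = HTTP::Response->new(301, 'Moved Permanently',
    [ Location => 'http://www.example.com/' ]);
$first->request(HTTP::Request->new(GET => 'http://example.com/'));

# Final response, linked back to the redirect that preceded it.
my $final = HTTP::Response->new(200, 'OK');
$final->request(HTTP::Request->new(GET => 'http://www.example.com/'));
$final->previous($first);

my $count = redirect_count($final);
if ($count) {
    printf "redirected %d time(s) to %s\n", $count, $final->request->uri;
}
else {
    print "no redirection\n";
}
```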


Test 12: Enter a keyword in search box, submit the form and check title of result page

Test:

Enter a value in a form of a page, submit the form and check title of resulting page

Learning:

submit_form is used to submit a form

Solution:

use strict;
use warnings;
use WWW::Mechanize;

my $m = WWW::Mechanize->new();
$m->get( "http://en.wikipedia.org/wiki/Main_Page" );
$m->submit_form(
    form_number => 1,
    fields => { search => 'honey' },
    button => 'go',
);
print $m->title, "\n";


Test 11: Find fields of all forms in a page

Test:

Find fields of all forms in a page

Learning:

param is used to get fields of a form

Solution:

use strict;
use warnings;
use WWW::Mechanize;

my $m = WWW::Mechanize->new();
$m->get( "http://www.rediff.com/" );
my @forms = $m->forms;
my $count = 1;
foreach my $form (@forms)
{
    # param without arguments returns the names of the form's fields
    my @inputfields = $form->param;
    print "form $count fields are:\n";
    $count++;
    foreach my $field (@inputfields)
    {
        print "$field\n";
    }
    print "\n";
}


Test 10: Find URLs in a webpage by keyword

Test:

Given a keyword, find all links whose text contains that keyword and print their URLs.

Learning:

$m->find_all_links( tag => "a", text_regex => qr/keyword/i ); finds the links whose text matches the keyword. To match the keyword against the URL instead, use url_regex.

Solution:

use strict;
use warnings;
use WWW::Mechanize;

my $m = WWW::Mechanize->new();
$m->get( "http://www.google.com/" );

my @links = $m->find_all_links( tag => "a", text_regex => qr/history/i );

foreach my $link (@links)
{
    print $link->text, "\n";
    print $link->url, "\n";
}
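The difference between matching on link text and matching on the URL can be seen with a local page. This sketch assumes an invented page with three links and loads it from a temporary file, so it needs no network access.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use URI::file;
use WWW::Mechanize;

# Hypothetical page: one link has "history" in its URL, another in its text.
my ($fh, $path) = tempfile(SUFFIX => '.html', UNLINK => 1);
print {$fh} <<'HTML';
<html><body>
<a href="/history/index.html">Archive</a>
<a href="/about.html">Our history</a>
<a href="/contact.html">Contact</a>
</body></html>
HTML
close $fh;

my $m = WWW::Mechanize->new();
$m->get(URI::file->new_abs($path)->as_string);

# text_regex looks at the link text, url_regex at the href value.
my @by_text = $m->find_all_links(tag => 'a', text_regex => qr/history/i);
my @by_url  = $m->find_all_links(tag => 'a', url_regex  => qr/history/i);

print "text match: ", $_->text, " -> ", $_->url, "\n" for @by_text;
print "url match: ",  $_->text, " -> ", $_->url, "\n" for @by_url;
```

Here text_regex selects the "Our history" link and url_regex selects the "Archive" link.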


Thursday, 2 October 2014

Test 9: Check that all links of webpage can be accessed successfully

Test:

Check that all links of webpage can be accessed successfully

Learning:

$m->get( $link->url ) also handles relative URLs: Mech resolves them against the base of the current page.

Solution:

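#!/usr/bin/perl
use strict;
use warnings;
use Test::More;
use WWW::Mechanize;

# autocheck is off so that a failed request is reported by ok() instead of ending the script
my $m = WWW::Mechanize->new( autocheck => 0 );
$m->get( "http://www.google.com/" );

ok( $m->success, 'Fetched OK' );

for my $link ( $m->links )
{
    $m->get( $link->url );
    # text is undef for links without text, for example image links
    my $text = defined $link->text ? $link->text : '';
    print $text."\n".$link->url."\n";
    ok( $m->success );
    $m->back();
    ok( $m->success, "Back successful" );
    print "\n";
}

# The number of links is not known in advance, so the plan is declared at the end
done_testing();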
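The relative URL handling noted under Learning can be illustrated with the URI module, which resolves an href against a base URL. This is only a sketch of the resolution rules; the base URL and the hrefs are made up.

```perl
use strict;
use warnings;
use URI;

# Hypothetical base: the address of the page that contains the links.
my $base = 'http://www.example.com/docs/index.html';

# A sibling file, a site-root path, a parent-directory path and an absolute URL.
for my $href ('about.html', '/top.html', '../up.html', 'http://other.example/page') {
    my $abs = URI->new_abs($href, $base);
    print "$href -> $abs\n";
}
```

For example, about.html resolves to http://www.example.com/docs/about.html, while the already absolute URL is left unchanged.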
