Plotting bar charts in R with ggplot2

I was introduced to plotting and exploring data in R during the online Coursera Data Science course. We covered the base plotting system, lattice and ggplot2, amongst others. I liked the look of ggplot2 because it allows figures to be customised in detail. I would like to use ggplot2 more often, as practice is the best way to learn, but I need to grasp the basic syntax first. The following is a basic introduction to making bar charts with ggplot2.

First, set up the R working environment and load the InsectSprays dataset, which contains counts of insects following treatment with different insecticides. Get the sum of all insects for each of the six spray categories (A to F) and plot the sums as a bar chart:

Draw a simple bar plot

# load ggplot2; library() stops with an error if the package is not installed
library(ggplot2)

# copy the built-in InsectSprays dataset
df <- InsectSprays

# get sum of all insects by spray
df2 <- aggregate(count ~ spray, df, sum)

# plot as a bar chart
p <- ggplot(df2, aes(x=spray, y=count)) + geom_bar(stat="identity")
p


Printing the date and time in a Perl script

I find it useful to print out the date and time after each step when running a long Perl script so that I can keep a check on its progress. If a step takes only a few seconds when it is expected to take much longer, that is a good indication of an error. The date and time also serve as a useful historical log of when a script was executed.

By adding a Perl subroutine at the end of the script, the date and time can be printed at any point with a single call to the subroutine, clock(). Here is an example script that prints a line of text followed by the date and time.


#!/usr/bin/perl
use strict;
use warnings;

print "The date and time is:\n";
# call the clock subroutine
clock();

# the clock subroutine: prints the current local time wrapped in asterisks
sub clock {
    my @months   = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
    my @weekDays = qw(Sun Mon Tue Wed Thu Fri Sat);
    my ($second, $minute, $hour, $dayOfMonth, $month, $yearOffset, $dayOfWeek, $dayOfYear, $daylightSavings) = localtime();
    my $year = 1900 + $yearOffset;
    # zero-pad the time fields so that 7 seconds prints as 07
    my $theTime = sprintf("%02d:%02d:%02d, %s %s %d, %d", $hour, $minute, $second, $weekDays[$dayOfWeek], $months[$month], $dayOfMonth, $year);
    print "*** $theTime ***\n\n";
}

The output of the script looks like this:

The date and time is:
*** 16:13:07, Tue May 3, 2016 ***

Wrapping the date and time with three asterisks allows me to quickly search for the end of each step in a large log file.
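
As an alternative, the same kind of timestamp can probably be produced with strftime() from the core POSIX module, which handles the zero-padding and the day and month names itself. The following is only a sketch under a few assumptions: the timestamp() helper is a hypothetical name that is not part of the original script, the day of the month is zero-padded by %d (so it prints 03 rather than 3), and the day and month abbreviations may vary with the system locale.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper (not in the original script): returns the timestamp
# as a string. With no arguments it uses the current local time; otherwise
# it expects a list in the same order that localtime() returns.
sub timestamp {
    my @t = @_ ? @_ : localtime();
    # %H, %M, %S and %d are zero-padded; %a and %b are the abbreviated
    # day and month names (these may depend on the system locale)
    return strftime("%H:%M:%S, %a %b %d, %Y", @t);
}

# Same three-asterisk wrapper as the clock() subroutine in the post
sub clock {
    print "*** ", timestamp(), " ***\n\n";
}

print "The date and time is:\n";
clock();
```

Returning the string from timestamp() rather than printing it directly means the same text could also be written to a separate log file, while clock() keeps the three-asterisk format that makes each step easy to find.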
