Transcript Slide 1

Programming and Perl for Bioinformatics, Part I
A Taste of Perl: print a message

perltaste.pl: Greet the entire world.
#!/usr/bin/perl             - command interpretation header
#greet the entire world     - a comment
$x = 6e9;                   - variable assignment statement
print "Hello world!\n";     - function calls
print "All $x of you!\n";     (output statements)
Basic Syntax and Data Types



Whitespace between tokens doesn't matter to Perl: all statements can be written on one line.
All Perl statements end in a semicolon (;), just like C.
Comments begin with '#', and Perl ignores everything after the # until the end of the line.
    Example: #this is a comment
Perl has three basic data types:
    scalar
    array (list)
    associative array (hash)
Scalars

Scalar variables begin with '$' followed by an identifier.
    Example: $this_is_a_scalar;
An identifier is composed of upper- or lower-case letters, digits, and the underscore '_', and cannot begin with a digit. Identifiers are case sensitive (like all of Perl).

$progname = "first_perl";
$numOfStudents = 4;
The = operator sets the content of $progname to the string "first_perl" and $numOfStudents to the integer 4.
Scalar Values

Numerical Values
    integer: 5, "3", 0, -307
    floating point: 6.2e9, -4022.33
    hexadecimal/octal: 0xd4f, 0477
    binary: 0b011011

NOTE: numerical values behave as double-precision floating-point numbers (Perl keeps integers in native integer form when they fit).
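A minimal sketch of how the literal forms above evaluate. The variable names are illustrative only and do not come from the slides.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The same literal forms as listed above; Perl converts each to a plain number.
my $hex    = 0xd4f;      # hexadecimal
my $octal  = 0477;       # octal (leading zero)
my $binary = 0b011011;   # binary
my $big    = 6.2e9;      # floating point in scientific notation
my $sum    = "3" + 5;    # a numeric string is converted when used as a number

print "0xd4f    = $hex\n";      # 3407
print "0477     = $octal\n";    # 319
print "0b011011 = $binary\n";   # 27
print "6.2e9    = $big\n";      # 6200000000
print "\"3\" + 5  = $sum\n";    # 8
```

Note that a leading zero changes the meaning of an integer literal: 0477 is octal, not the decimal number 477.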
Do the Math

Mathematical operators work pretty much as you would expect:
    4+7
    6*4
    43-27
    256/12
    2/(3-5)

Example
#!/usr/bin/perl
print "4+5\n";
print 4+5 , "\n";
print "4+5=" , 4+5 , "\n";
$myNumber = 88;

What will be the output?
4+5
9
4+5=9
Note: use commas to separate multiple items in a print statement
Scalar Values

String values

Example:
$day = "Monday";
print "Happy Monday!\n";
print "Happy $day!\n";
print 'Happy Monday!\n';
print 'Happy $day!\n';

What will be the output?
Happy Monday!<newline>
Happy Monday!<newline>
Happy Monday!\n
Happy $day!\n
(the last two statements print a literal \n rather than a line break, so their output runs together on one line)

Double-quoted: interpolates (replaces a variable name/control character with its value)
Single-quoted: no interpolation done (as-is)
String Manipulation
Concatenation
$dna1 = "ACTGCGTAGC";
$dna2 = "CTTGCTAT";

Juxtapose in a string assignment or print statement:
$new_dna = "$dna1$dna2";

Use the concatenation operator '.':
$new_dna = $dna1 . $dna2;
Substring
$dna = "ACTGCGTAGC";
$exon1 = substr($dna,2,5); # TGCGT
Positions are counted from 0, so 2 is the offset of the third character; 5 is the length of the substring.
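The concatenation and substring operations can be tried together in a short script. This is a sketch reusing the slide's sequences; $joined and $tail are names introduced here for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $dna1 = "ACTGCGTAGC";
my $dna2 = "CTTGCTAT";

# Concatenate with the '.' operator.
my $joined = $dna1 . $dna2;

# substr(STRING, OFFSET, LENGTH): offsets count from 0.
my $exon1 = substr($dna1, 2, 5);

# With no LENGTH, substr runs to the end of the string;
# a negative OFFSET counts back from the end.
my $tail = substr($joined, -4);

print "Joined: $joined\n";   # ACTGCGTAGCCTTGCTAT
print "Exon 1: $exon1\n";    # TGCGT
print "Tail:   $tail\n";     # CTAT
```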
Substitution
DNA transcription: T → U
Substitution operator s/// :
$dna = "GATTACATACACTGTTCA";
$rna = $dna;
$rna =~ s/T/U/g; # "GAUUACAUACACUGUUCA"
=~ is a binding operator: it tells Perl to apply the substitution to the contents of $rna. The g modifier replaces every match, not just the first.
Ex: Start with $dna = "gaTtACataCACTgttca";
and do the same as above. What will be the output?
Example

transcribe.pl:
$dna ="gaTtACataCACTgttca";
$rna = $dna;
$rna =~ s/T/U/g;
print "DNA: $dna\n";
print "RNA: $rna\n";




Does it do what you expect? If not, why not?
Patterns in substitution are case-sensitive! What can we do?
    Convert all letters to upper/lower case (preferred when possible)
    If we want to retain mixed case, use the transliteration/translation operator tr///
$rna =~ tr/tT/uU/; #replace all t by u, all T by U
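To compare the two operators on the mixed-case sequence, here is a small sketch. The variable names $rna_s, $rna_tr and $t_count are illustrative; counting characters with tr is an extra feature of the operator that the slides do not cover.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $dna = "gaTtACataCACTgttca";

# s/T/U/g only touches upper-case T.
my $rna_s = $dna;
$rna_s =~ s/T/U/g;

# tr/tT/uU/ maps t->u and T->U, preserving case.
my $rna_tr = $dna;
$rna_tr =~ tr/tT/uU/;

# tr returns the number of characters it matched; with an empty
# replacement list it counts without changing the string.
my $t_count = ($dna =~ tr/tT//);

print "DNA:        $dna\n";
print "s/T/U/g:    $rna_s\n";    # gaUtACataCACUgttca
print "tr/tT/uU/:  $rna_tr\n";   # gaUuACauaCACUguuca
print "t/T count:  $t_count\n";  # 6
```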
Case conversion
$string = "acCGtGcaTGc";
Upper case:
$dna = uc($string); # "ACCGTGCATGC"
or $dna = uc $string;
or $dna = "\U$string";
Lower case:
$dna = lc($string); # "accgtgcatgc"
or $dna = "\L$string";
Sentence case:
$dna = ucfirst(lc($string)); # "Accgtgcatgc" (ucfirst alone changes only the first character)
or $dna = "\u\L$string";
Reverse Complement
5’- A C G T C T A G C . . . . G C A T -3’
3’- T G C A G A T C G . . . . C G T A -5’

Reverse: reverses a string
$string = "ACGTCTAGC";
$string = reverse($string); # "CGATCTGCA"

Complementation: use transliteration operator
$string =~ tr/ACGT/TGCA/; # "GCTAGACGT"
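The two steps can be combined into one helper. The sketch below assumes a hypothetical subroutine name, reverse_complement, and extends the transliteration to lower-case bases; neither is part of the original slides.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the reverse complement of a DNA string.
sub reverse_complement {
    my ($seq) = @_;
    # Assigning to a scalar puts reverse in scalar context,
    # so it reverses the characters of the string.
    my $rc = reverse $seq;
    # Complement each base, keeping upper/lower case.
    $rc =~ tr/ACGTacgt/TGCAtgca/;
    return $rc;
}

my $strand = "ACGTCTAGC";
my $rc     = reverse_complement($strand);

print "5'-$strand-3'\n";
print "5'-$rc-3' (reverse complement)\n";   # GCTAGACGT
```

Be aware that reverse in list context (for example, directly inside print) reverses the order of its arguments instead of the characters, which is why the result is assigned to a scalar first.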
More on String Manipulation
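String length:
length($dna)
Index:
#index STR,SUBSTR,POSITION (POSITION is optional)
index($strand, $primer, 2)
Returns the position of the first occurrence of SUBSTR at or after POSITION, or -1 if it is not found.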
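A short sketch of length and index on a made-up strand. The sequence and primer values are invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Illustrative data, not from the slides.
my $strand = "ACTGCGTAGCTGCA";
my $primer = "TGC";

my $len    = length($strand);                       # number of characters
my $first  = index($strand, $primer);               # search from position 0
my $second = index($strand, $primer, $first + 1);   # search after the first hit
my $none   = index($strand, "GGG");                 # -1 when not found

print "Length: $len\n";          # 14
print "First hit: $first\n";     # 2
print "Second hit: $second\n";   # 10
print "Missing: $none\n";        # -1
```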
Flow Control
Conditional Statements

Parts of the code are executed depending on the truth value of a logical statement.
"Truth" (logical) values in Perl:
false = {0, 0.0, 0e0, "0", "", undef}, default ""
true = anything else, default 1
($a, $b) = (75, 83);
if ( $a < $b ) {
    $a = $b;
    print "Now a = b!\n";
}
if ( $a > $b ) { print "Yes, a > b!\n" } # Compact
Comparison Operators
Comparison                  String   Number
Equality                    eq       ==
Inequality                  ne       !=
Greater than                gt       >
Greater than or equal to    ge       >=
Less than                   lt       <
Less than or equal to       le       <=
    (the operators above return 1/null)
Comparison                  cmp      <=>
    (returns -1, 0, 1)
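Choosing the string or the number operator matters when the values look like numbers. A minimal sketch, with illustrative variable names:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my ($x, $y) = ("10", "9");

# Numeric comparison: 10 is greater than 9.
my $num_gt = ($x > $y) ? 1 : 0;

# String comparison: "10" sorts before "9" because "1" comes before "9".
my $str_gt = ($x gt $y) ? 1 : 0;

# The three-way operators return -1, 0, or 1.
my $num_cmp = $x <=> $y;
my $str_cmp = $x cmp $y;

print "10 >   9 : $num_gt\n";    # 1
print "10 gt  9 : $str_gt\n";    # 0
print "10 <=> 9 : $num_cmp\n";   # 1
print "10 cmp 9 : $str_cmp\n";   # -1
```

For sequence data such as "ACGT", use the string operators (eq, ne, and so on); applying == to non-numeric strings compares them all as 0.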
Logical Operators
Operation    Computerese    English version
AND          &&             and
OR           ||             or
NOT          !              not
if/else/elsif

Allows for multiple branching/outcomes:
$a = rand();
if ( $a < 0.25 ) {
    print "A";
}
elsif ( $a < 0.50 ) {
    print "C";
}
elsif ( $a < 0.75 ) {
    print "G";
}
else {
    print "T";
}
Conditional Loops
while ( statement ) { commands … }
    repeats commands until statement is no longer true
do { commands } while ( statement );
    same as while, except commands are executed at least once
    NOTE the ';' after the while statement!!
Short-circuiting commands: next and last
    next; #jumps to end, do next iteration
    last; #jumps out of the loop completely
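A small sketch of next and last inside a while loop. The counter and the $printed string are illustrative names.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $i       = 0;
my $printed = "";

while ($i < 10) {
    $i++;
    if ($i % 2 == 0) {
        next;            # even number: skip the rest of this iteration
    }
    if ($i > 7) {
        last;            # leave the loop completely
    }
    $printed .= "$i ";   # reached only for 1, 3, 5, 7
}

print "$printed\n";      # 1 3 5 7
```

The counter is incremented before next is called; placing $i++ after the next would skip the increment and the loop would never end.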
while
Example:
while ($alive) {
    if ($needs_nutrients) {
        print "Cell needs nutrients\n";
    }
}
Any problem?
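One thing to look for: nothing inside the loop ever changes $alive, so if it starts out true the loop never ends. A possible fix is sketched below, assuming a hypothetical $energy counter that eventually switches $alive off; the counter and $messages are not part of the original example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $alive           = 1;
my $needs_nutrients = 1;
my $energy          = 3;   # hypothetical: remaining iterations before the cell dies
my $messages        = 0;   # counts how many times the message is printed

while ($alive) {
    if ($needs_nutrients) {
        print "Cell needs nutrients\n";
        $messages++;
    }
    $energy--;
    if ($energy <= 0) {
        $alive = 0;        # the loop condition can now become false
    }
}

print "Printed $messages messages\n";   # 3
```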
for and foreach loops


Execute a code loop a specified number of times, or for a specified list of values.
for and foreach are identical: use whichever you want.
Incremental loop ("C style"):
for ( $i=0 ; $i < 50 ; $i++ ) {
    $x = $i*$i;
    print "$i squared is $x.\n";
}
Loop over list ("foreach" loop):
foreach $name ( "Billy", "Bob", "Edwina" ) {
    print "$name is my friend.\n";
}
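Combining a C-style for loop with if/elsif/else and rand() gives a simple random DNA generator. This is a sketch; the sequence length of 20 is an arbitrary choice made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $length = 20;   # arbitrary length chosen for this example
my $dna    = "";

for (my $i = 0; $i < $length; $i++) {
    my $r = rand();        # random number in [0, 1)
    if ($r < 0.25) {
        $dna .= "A";
    }
    elsif ($r < 0.50) {
        $dna .= "C";
    }
    elsif ($r < 0.75) {
        $dna .= "G";
    }
    else {
        $dna .= "T";
    }
}

print "Random DNA: $dna\n";
```

Each base is appended with equal probability, so the output differs from run to run.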
Basic Data Types
Perl has three basic data types:
    scalar
    array (list)
    associative array (hash)