Trivia - Useful to know

This page collects small things that make Perl easier to read and write once you know them.

BEGIN

A BEGIN block is executed at compile time, as soon as the block has been parsed.

BEGIN {
  # Statement you want to execute when compiling
}
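
As a small illustration of what "at compile time" means, the sketch below records the order in which two statements actually run. The variable name @order is made up for this example.

```perl
use strict;
use warnings;

our @order;

# An ordinary statement: it runs at run time, after the whole file is compiled.
push @order, 'runtime';

# Although it appears later in the file, this block runs first,
# as soon as the compiler has finished parsing it.
BEGIN {
  push @order, 'begin';
}

print join(', ', @order), "\n";
```

Run as a script, this should print "begin, runtime", which is why use (a BEGIN block in disguise) takes effect before any ordinary statement.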

require

require is similar to use, except that it loads the module at runtime and does not automatically execute the import method.

use File::Basename 'basename';

# Same meaning as above
BEGIN {
  require File::Basename;
  File::Basename->import('basename');
}

If you want to load a module dynamically, you can use require, but for general use it is simpler to stick with use.
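
A minimal sketch of dynamic loading, assuming the module name only becomes known at run time. The variable names and the sample path are hypothetical; File::Basename is used simply because it ships with Perl.

```perl
use strict;
use warnings;

# Hypothetical: the module name is only known at run time.
my $module = 'File::Basename';

# require with a string expects a file name, so convert
# "File::Basename" to "File/Basename.pm" first.
(my $file = "$module.pm") =~ s{::}{/}g;
require $file;

# Nothing was imported, so call the function by its full name.
my $name = File::Basename::basename('/tmp/report.txt');
print "$name\n";
```

Because require does not call import, the function has to be called with its package name, or import has to be called by hand as in the BEGIN example above.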

local

local does not create a local variable. It temporarily changes the value of an existing variable and restores the original value when the scope ends. It cannot be applied to a lexical variable declared with my. It can be applied to package variables and to individual hash and array elements.

our $num = 1;
{
  # Temporary change
  local $num = 2;
}

# Back to 1
print $num;

my $scores = {math => 90, english => 100};
{
  # Temporary change
  local $scores->{math} = 80;
}

# Return to 90
print $scores->{math};

However, local is rarely needed outside of very specific purposes. my is enough in almost every situation, so first consider seriously whether you can use my instead.

For a detailed explanation of local, see the "Temporary Values via local()" section of perldoc perlsub.
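
One thing my cannot do is change a value for the subroutines that are called from the current scope. The sketch below illustrates this with a package variable; the names $indent, current_indent and nested are invented for the example.

```perl
use strict;
use warnings;

our $indent = 0;

sub current_indent {
  return $indent;
}

sub nested {
  # Visible to every subroutine called from here until this scope ends.
  local $indent = $indent + 2;
  return current_indent();
}

my $inside = nested();          # value seen while local was in effect
my $after  = current_indent();  # original value, restored automatically

print "$inside $after\n";
```

The call inside nested sees 2, and once nested returns the value is back to 0 without any explicit cleanup.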

Multi-line comment

Perl has no multi-line comment syntax, which is a bit disappointing; the only comment marker is #. If you want to comment out several lines while debugging, you can wrap them so that they are interpreted as documentation (POD).

=pod

my $str = 'aaa';
my $foo = 'foo';

=cut

The part between "=pod" and "=cut" is interpreted as documentation and ignored.

Different sigils mean different variables

Variables with different sigils are completely different variables.

# Array
my @ids;

# Scalar
my $ids;

Both variables above are named ids, but they are completely different variables.

Examine the meaning of predefined variables

Predefined variables are very hard to find with a Google search, and searching the documentation for them is tedious too. You can use the "-v" option of the perldoc command to look up the meaning of a predefined variable, albeit in English. Quote the variable name so that the shell does not try to interpret it.

perldoc -v '$.'

Method call

In Perl you use -> to make method calls.

# Method call
SomeClass->method('a', 'b');
$obj->method('a', 'b');

A method call differs from a function call in that the value to the left of -> is passed as the first argument of the subroutine. In the example above, the first argument of method is the string 'SomeClass' or the object $obj; 'a' and 'b' are the second and third arguments.

The receiving side is written as follows.

# For class methods
sub method {
  my ($class, $arg1, $arg2) = @_;
}

# For object methods
sub method {
  my ($self, $arg1, $arg2) = @_;
}
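
Putting the two kinds of method together, a minimal sketch might look like this. The Counter class and its methods are hypothetical and exist only for this illustration.

```perl
use strict;
use warnings;

package Counter;

# Class method: $class is the string 'Counter'.
sub new {
  my ($class, $start) = @_;
  return bless { count => $start }, $class;
}

# Object method: $self is the object to the left of ->.
sub add {
  my ($self, $n) = @_;
  $self->{count} += $n;
  return $self->{count};
}

package main;

my $counter = Counter->new(1);
my $total   = $counter->add(2);
print "$total\n";
```

Counter->new(1) arrives in new as ('Counter', 1), and $counter->add(2) arrives in add as ($counter, 2).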

Mini perldoc guide

You can see the Perl documentation with the perldoc command.

perldoc perlfunc

You can also see the module documentation.

perldoc File::Basename

If you want to see the source code of the module, use the -m option.

perldoc -m File::Basename

If you want to know the path of the module file, use the -l option.

perldoc -l File::Basename

If you want to see the documentation of a builtin function, use the -f option.

perldoc -f substr

It can be easier to read if you redirect the output to a file.

perldoc File::Basename > File-Basename.txt

Anonymous subroutine

In Perl you can create an anonymous subroutine by using sub {}.

my $twice = sub {
  my $num = shift;

  return $num * 2;
};

To call it:

my $result = $twice->(5);
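
Because an anonymous subroutine is just a value, it can be handed to another subroutine. The following sketch assumes a helper named apply_to_all, which is not a builtin and is defined here only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper that applies a subroutine reference to each value.
sub apply_to_all {
  my ($code, @values) = @_;
  return map { $code->($_) } @values;
}

my $twice = sub {
  my $num = shift;
  return $num * 2;
};

my @doubled = apply_to_all($twice, 1, 2, 3);
print "@doubled\n";
```

This should print "2 4 6".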

Omission of {} in dereference

The braces {} around the reference can be omitted when dereferencing.

my @array = @$array_ref;

# Same meaning as below
my @array = @{$array_ref};

{} is required if the expression is not a simple variable.

my @array = @{$var->nums};

How to read the documentation for file test operators

File test operators such as -s and -f are used often, but it is hard to work out where their documentation is. At the command line:

perldoc -f -X

Character substitution tr

You may come across tr from time to time. The tr operator replaces characters one by one (transliteration). The following statement replaces a with 1, b with 2, and c with 3.

$str =~ tr/abc/123/;
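
A few more uses of tr, as a sketch with made-up variable names: transliterating a copy instead of the original, using a range, and counting characters.

```perl
use strict;
use warnings;

my $str = 'abcabc';

# Each character is replaced by the one at the same position: a=>1, b=>2, c=>3.
(my $digits = $str) =~ tr/abc/123/;

# Ranges work too; this upper-cases ASCII letters in a copy.
(my $upper = $str) =~ tr/a-z/A-Z/;

# With an empty replacement list nothing changes, and tr returns the count.
my $count_a = ($str =~ tr/a//);

print "$digits $upper $count_a\n";
```

Unlike s///, tr does not use regular expressions; it works purely on a character-by-character basis.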

See the settings of Perl itself

To see how perl itself was built and configured, run it with the -V option.

perl -V

Hash slice of hash reference

The notation for taking a hash slice through a hash reference is very hard to read.

my ($key1, $key2, $key3) = @{$hash}{qw/key1 key2 key3/};

I think it is better to assign the values one by one if possible.

my $key1 = $hash->{key1};
my $key2 = $hash->{key2};
my $key3 = $hash->{key3};

How to use eval

eval {execution statement; 1;} or die "Exception";

You often see this form, but it is a little hard to read. If the code inside {} finishes normally, eval returns the value of the last statement, which here is 1, so the right-hand side of or is not executed. If an exception is thrown, eval returns undef and the right-hand side is executed. The trailing 1; is there because, without it, a false return value from the last statement (such as 0) would trigger the right-hand side even though the code succeeded.

I prefer the following form.

eval {execution statement};
die "Exception" if $@;
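
Here is a runnable sketch of both styles. The check_positive function is hypothetical and stands in for whatever code might throw an exception.

```perl
use strict;
use warnings;

# Hypothetical function that throws for non-positive input.
sub check_positive {
  my $n = shift;
  die "not positive: $n\n" if $n <= 0;
  return $n;
}

# Style 1: the trailing 1 makes eval return true on success,
# even when the checked code itself returns a false value.
my $ok = eval { check_positive(5); 1 };

# Style 2: run the code, then look at $@ afterwards.
eval { check_positive(-1) };
my $error = $@;

print $ok ? "ok\n" : "failed\n";
print "caught: $error";
```

Copying $@ into a variable right after eval, as done here, keeps the message from being overwritten by a later eval.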

Frequently used predefined variables

  • @_ - Subroutine arguments
  • @ARGV - Command line arguments
  • $. - Line number when reading a file
  • $0 - Script name
  • $1 - The part that matched the first parentheses () in the regular expression. $2 and $3 work the same way
  • $! - OS error message
  • $@ - Exception message when catching an exception with eval
  • $? - Child process status

Default argument of shift function

If shift is called without an argument, it operates on the command line arguments (@ARGV) at the top level, and on the subroutine arguments (@_) inside a subroutine. A bare shift is used very often.

# Top level
my $num = shift;

# Same meaning as below
my $num = shift @ARGV;

# In a subroutine
sub {
  my $num = shift;
  
  # Same meaning as below
  my $num = shift @_;
}

Exit status of external process

When an external process is called with system or a similar function, its exit status is stored in the upper 8 bits of the predefined variable $?. If the command could not be started at all, $? is set to -1. So, to check that the exit status is normal (0 means success), do the following.

my $command = "ls -l";
system $command;

if ($? == -1) {
  die "failed to execute:$!\n";
}
elsif ($? >> 8 != 0) {
  die "Return error status\n";
}
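
The lower bits of $? carry information as well. The sketch below runs a child perl that exits with status 3 and decodes $?; it uses $^X (the path of the running perl) so that no particular external command has to exist.

```perl
use strict;
use warnings;

# Run a child process that simply exits with status 3.
system($^X, '-e', 'exit 3');

die "failed to execute: $!\n" if $? == -1;

my $signal    = $? & 127;   # non-zero if the child was killed by a signal
my $exit_code = $? >> 8;    # the value the child passed to exit

print "signal=$signal exit=$exit_code\n";
```

Passing the command to system as a list, as done here, also avoids going through the shell.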

Perl's ithread implementation isn't very good

Perl has a thread implementation called ithreads (interpreter threads), but it is not well regarded, and the threads documentation itself discourages its use. Many CPAN modules were also not written with thread safety in mind, so you may run into unexpected bugs. Each thread gets its own copy of the interpreter, so ithreads do not behave like the lightweight native threads of other languages. For processing that relies on threads being switched automatically, it is safer not to write it in Perl.

If you want to switch between tasks yourself, you can use the coroutine module Coro.

Classic Perl

I don't use these when writing code, but you may come across older syntax when reading it.

The vars pragma has the same effect as declaring the variable with our.

# Same as our $VAR;
use vars '$VAR';

Operators starting with q that you want to remember

Operators starting with q are hard to remember, so here is a summary. You can use //, {}, ##, ||, and so on as the enclosing characters.

q is the single quote operator. It behaves like single quotes, except that a literal single quote can be used inside it without escaping.

my $str = q/aaa'''bbb/;

qq is the double quote operator. It behaves like double quotes, except that a literal double quote can be used inside it without escaping.

my $message = qq/aaa """ $str/;

qw is a string list operator. You can easily write a list of strings.

# Same as ('foo', 'bar', 'baz')
my @strs = qw/foo bar baz/;

qr creates a reference to a compiled regular expression.

my $regex_ref = qr/^aaa.*bbb$/ms;
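
The four operators side by side, as a small sketch with invented variable names and {} or () as the enclosing characters:

```perl
use strict;
use warnings;

my $name = 'Perl';

my $single = q{It's $name};          # no interpolation, ' needs no escape
my $double = qq{Say "$name"};        # interpolation, " needs no escape
my @words  = qw(foo bar baz);        # list of three strings
my $regex  = qr/^aaa.*bbb$/ms;       # compiled regular expression

# A qr object can be used directly on the right-hand side of =~.
my $matched = "aaa\nxxx\nbbb" =~ $regex ? 'yes' : 'no';

print "$single | $double | ", scalar(@words), " | $matched\n";
```

Note that q leaves $name as literal text, while qq expands it.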

1; is required at the end of the module

A module must end with 1;. It does not have to be 1, since any true value will do, but 1; is the usual choice.

package SomeModule;

# Implementation

1;

No semicolon is required at the end of a block, and no return is required for the last evaluated value

The semicolon after the last statement in a block is optional.

sub func1 {
  my $arg = shift;

  # This semicolon may or may not be present.
  return $arg;
}

Also, return is not needed when the return value is the last expression evaluated.

sub func1 {
  my $arg = shift;

  # return is optional here.
  $arg;
}

Normally it is better to write both return and the semicolon. For a one-line function, though, it often reads better without them.

# Constant etc.
sub CONST_VALUE {3}

When you want to experiment with Perl

When I want to run an example I found somewhere, I keep a simple file called a.pl and experiment there. Deleting old code each time is a hassle, so when you want to start a new experiment, put "__END__" after it; the script is considered to end there.

perl a.pl

The contents of a.pl:

use strict;
use warnings;

print 'aaa';

__END__

print 'bbb';

Do not end the program when the last line is executed in the debugger

To keep the program from exiting as soon as the last line is executed in the Perl debugger, put a meaningless "1;" statement on the last line as follows.

my $str = 'aaa';
print $str;

1;

Shortcut for reading files

In a throwaway script, writing the code to open a file can be tedious. The diamond operator on its own reads line by line from the files named on the command line, or from standard input if none are given.

# Command line
perl script.pl file1 file2

# Read line by line from file
while (my $line = <>) {
  ...
}

This is interpreted as if it were written like this:

while (defined(my $line = <>)) {
  ...
}

so it is safe: a line that contains only 0 is still read correctly.

Notes on variable expansion

If the variable name is immediately followed by two colons (::) or an underscore, Perl treats those characters as part of the name and the variable is not expanded as intended. In such cases, enclose the name in "{}" to mark where it ends.

my $name = 'aaa';
my $message = "${name}::aaa ${name}_ppp";

Take a peek at the symbol table

The symbol table is the data structure in which names such as variables and subroutines are registered. The symbols of each package are stored in a hash whose name is the package name followed by "::".

# Symbol table
%main::
%CGI::
%File::Basename::

Use the each function only in while

The each function keeps an internal iterator, so if you call it only once, the iterator stays advanced.

# Iterator stays advanced.
my ($key, $value) = each %hash;

Calling the keys function resets the iterator, but relying on that is not a good idea.

# Reset iterator
keys %hash;

Use the each function only in the condition of a while loop.

while (my ($key, $value) = each %hash) {
  ...
}
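
A complete version of the loop, as a sketch with a made-up %scores hash. It also shows a for loop over keys, which is a common alternative when the iterator state is a concern.

```perl
use strict;
use warnings;

my %scores = (math => 90, english => 100);

# each inside while: the loop runs until the iterator reaches the end.
my $total = 0;
while (my ($subject, $score) = each %scores) {
  $total += $score;
}

# When you only need the keys, a for loop over keys avoids the iterator
# question altogether.
my @subjects = sort keys %scores;

print "$total @subjects\n";
```

The order in which each returns pairs is not defined, so the example only sums the values and sorts the keys before printing them.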

The filehandle passed to the print function is not the first argument

This is an odd piece of syntax: the filehandle passed to the print function is not the first argument, and there is no comma after it.

# $fh is not the first argument.
print $fh $output;

This is called indirect object notation and has the same meaning as:

$fh->print($output);
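
The sketch below writes the same kind of line in three ways. It writes to an in-memory filehandle so that nothing is left on disk; the variable names are arbitrary. The form with braces around the filehandle is a common way to make the missing comma stand out.

```perl
use strict;
use warnings;
use IO::Handle;   # makes $fh->print available on older perls too

# Write to an in-memory file so the example leaves nothing on disk.
my $buffer = '';
open my $fh, '>', \$buffer or die "Cannot open in-memory file: $!";

print $fh "first\n";       # filehandle, then no comma
print {$fh} "second\n";    # braces make the filehandle stand out
$fh->print("third\n");     # method-call form

close $fh or die "Cannot close: $!";
print $buffer;
```

All three statements write to the same filehandle, so the buffer ends up with three lines.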

On Windows IO::Poll and select only work for sockets

On Windows, asynchronous I/O with IO::Poll and select only works on sockets, not on files.

On Windows, glob does not work well if its argument contains a space

The glob function splits its argument on whitespace and treats each piece as a separate pattern, which is a common problem with Windows paths that contain spaces. To make it work, enclose the pattern in double quotes inside the string.

my @files = glob('"C:/Documents and Settings/*.*"');

When you need parentheses for a subroutine call

If the subroutine has already been declared, no parentheses are needed.

# If the subroutine has already been declared, no parentheses are needed.
sub func1 {print $_[0]}
func1 'aaaa';

The same is true when importing.

# You don't need parentheses when you import
use Carp 'croak';

croak 'aaa';

Parentheses are required if the subroutine is declared later.

# Parentheses are needed if the subroutine is declared afterwards
func1('aaa');
sub func1 {print 1}

Anonymous subroutines exist from start to finish

Anonymous subroutines look as if they are created dynamically, but their code is compiled at the beginning of the program and destroyed at the end of the program. The following code produces a subroutine reference, not the subroutine itself. Be careful, because the two are easy to confuse.

{
  # What is generated is a subroutine reference, not the subroutine itself
  my $sub = sub {
    ...
  };
}

Regular expression list context

To extract the parts of a string matched by a regular expression, combine the match with an if statement as follows.

if ($str =~ /regular expression/) {
  my $match1 = $1;
  my $match2 = $2;
}

Another way to write it is to evaluate the regular expression in the list context to get the matched characters.

my ($match1, $match2) = $str =~ /regular expression/;
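
A concrete sketch of the list-context form, using invented sample strings. It also shows that the list assignment can itself serve as the condition of an if statement.

```perl
use strict;
use warnings;

my $str = '2024-05-17';

# In list context a successful match returns the captured groups.
my ($year, $month) = $str =~ /^(\d{4})-(\d{2})/;

# A failed match returns an empty list, so the condition is false
# and the captures are never used by mistake.
my $result;
if (my ($hour, $minute) = 'no time here' =~ /^(\d+):(\d+)$/) {
  $result = "$hour:$minute";
}
else {
  $result = 'no match';
}

print "$year $month $result\n";
```

Without the if, a failed match simply leaves the variables undefined, so check them before use.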

Change the enclosing character of the regular expression

Normally, a regular expression is written as follows.

$str =~ /regular expression/;

There is another way of writing it with the same meaning, using m.

$str =~ m/regular expression/;

The reason this form exists is that it lets you change the enclosing character when the regular expression contains many "/" characters and escaping them becomes tedious. This is an example of changing the enclosing character to "#".

$str =~ m#http://aaa.com#;

ord function is an abbreviation for ordinal

The ord function, which returns the code point of the first character of the string passed to it, is an abbreviation for ordinal, as in "ordinal number". Its counterpart, the chr function, converts a code point back to a character.

ord function <-> chr function
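
A short sketch of the pair, with arbitrary variable names:

```perl
use strict;
use warnings;

my $code       = ord('A');        # code point of the character A
my $char       = chr(65);         # character for code point 65
my $round_trip = chr(ord('z'));   # converting there and back

print "$code $char $round_trip\n";
```

This should print "65 A z".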

or is read in English as "if the left side is not true".

Although or is a logical operator, it is often used for conditional branching rather than for its logical value.

A or B

Perl's or operator short-circuits: if the left side (A) is true, "A or B" is true regardless of the right side (B), so the right side is not executed. This property is what makes it usable for conditional branching.

That is, if the left side (A) is true, the right side (B) is not executed, and if the left side (A) is false, the right side (B) is executed. In English it can be read as "if the left side (A) is not true, do the right side (B)".
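
The behavior can be observed with a small sketch. The step subroutine and the @log array are hypothetical and only record what was executed.

```perl
use strict;
use warnings;

my @log;

# Hypothetical step that records its call and returns its argument.
sub step {
  my $ok = shift;
  push @log, "step($ok)";
  return $ok;
}

step(1) or push @log, 'right side ran';   # left side true: right side skipped
step(0) or push @log, 'right side ran';   # left side false: right side runs

print join(', ', @log), "\n";
```

The right side appears in the log only once, after the call whose left side was false. This is the same mechanism behind the familiar open(...) or die idiom.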

Write the configuration file in Perl

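It is convenient to write the configuration file in Perl. Use do to load the configuration file.

# Since Perl 5.26 the current directory is not in @INC,
# so give a path such as "./app.conf" rather than a bare file name.
my $conf_file = "./app.conf";
my $conf = do $conf_file
  or die qq/Can't load config file "$conf_file":$! $@/;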

The contents of the configuration file:

{
  name => 'Foo',
  number => 9
}

If the file cannot be read or parsed, do returns undef. If the file cannot be read, $! is set; if the Perl source fails to compile, $@ is set to the error message.

I used to write configuration files in XML or JSON, but on reflection, writing them in Perl is the easiest.

Please note that you should not use do to read anything other than the configuration file you wrote yourself. Otherwise there is a risk that arbitrary scripts will be executed.
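
The following self-contained sketch creates a throwaway configuration file with File::Temp, loads it with do, and reports read errors and parse errors separately. The file contents and variable names are made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a throwaway configuration file so the example is self-contained.
my ($fh, $conf_file) = tempfile(SUFFIX => '.conf', UNLINK => 1);
print {$fh} "{\n  name => 'Foo',\n  number => 9\n}\n";
close $fh or die "Cannot close $conf_file: $!";

# The file lives in the system temporary directory and is given by its
# full path, so do does not need to search @INC for it.
my $conf = do $conf_file;
unless (defined $conf) {
  die qq/Can't parse "$conf_file": $@/ if $@;
  die qq/Can't read "$conf_file": $!/;
}

print "$conf->{name} $conf->{number}\n";
```

Checking $@ before $! is the order suggested in perldoc -f do, since $! may be left over from an unrelated earlier failure.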
