List of modern Perl writing methods

I will explain how to write standard Perl in Perl 5.8 or later.

If you search the internet, you will find many old explanations dating from the Perl 4 era, and many books still use Perl 4 notation. That notation is complicated and error-prone, so anyone writing Perl from now on is strongly encouraged to use the modern Perl 5 style.

strict pragma and warnings pragma (required)

Enable the strict and warnings pragmas.

use strict;
use warnings;

Always write the two lines use strict; and use warnings; at the beginning of a script. They tighten Perl's checks on your code. If you skip them because they seem like a bother, you will have far more trouble later.

The only case where you can omit use strict; and use warnings; is a command line script, the so-called one-liner.
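
As a minimal sketch of what strict catches, the following compiles a snippet containing a misspelled variable name and captures the resulting error. The variable names and the use of string eval are assumptions made only for illustration; in a real script the error simply appears when the script is compiled.

```perl
use strict;
use warnings;

# A snippet with a typo: $user_nmae instead of $user_name.
# Single quotes keep the variables from being interpolated here.
my $code = 'my $user_name = "taro"; my $copy = $user_nmae; 1;';

# String eval inherits "use strict" from this file, so the typo
# is reported as a compile error instead of silently being undef.
my $ok    = eval $code;
my $error = $@;

print "compiled: ", ($ok ? "yes" : "no"), "\n";
print "error: $error" if $error;
```

Without use strict; the misspelled variable would quietly be treated as a new, empty package variable.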

Use a lexical variable for filehandles (required unless you really have a specific purpose)

Use a lexical variable for the filehandle. If you declare a lexical variable and pass it as the first argument of the open function, the filehandle is stored in that variable.

# Lexical variable
my $fh;
my $file = 'file1';

open $fh, '<', $file
  or die "Cannot open '$file': $!";

while (my $line = <$fh>) {
  ...
}

Lexical variables have the great advantage of having a scope and of being passable as arguments to other functions. Avoid barewords and typeglobs such as FH and *FH, which appear in older explanations.

It is even more modern to declare the lexical variable inside the call to the open function.

open my $fh, '<', $file
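
To illustrate passing a lexical filehandle to another function, here is a minimal sketch. The subroutine count_lines is a hypothetical helper, and an in-memory file is used only so that the example runs without any file on disk.

```perl
use strict;
use warnings;

# Hypothetical helper: counts the lines of whatever filehandle it receives
sub count_lines {
  my ($fh) = @_;

  my $count = 0;
  while (my $line = <$fh>) {
    $count++;
  }
  return $count;
}

# Opening a reference to a scalar gives an in-memory filehandle
my $text = "first\nsecond\nthird\n";
open my $fh, '<', \$text
  or die "Cannot open in-memory file: $!";

my $line_count = count_lines($fh);
close $fh;

print "$line_count\n";
```

Because $fh is an ordinary scalar, it is passed like any other argument and goes out of scope at the end of the enclosing block.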

Use open function with 3 arguments (required)

Try to use the 3-argument open function.

open my $fh, '<', $file

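Some old explanations use the two-argument open function, but do not use it. With two arguments the mode is parsed out of the same string as the file name, so a file name containing characters such as > or | can cause a different file to be written or a command to be run. This is a security risk.

open my $fh, "<$file"; # Do not use the two-argument form of open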
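
As a small sketch of the difference, the following uses a hypothetical file name that ends in a pipe character. With the three-argument form the string is treated purely as a file name, so the open simply fails (assuming no file with that odd name exists in the current directory) and no command is run.

```perl
use strict;
use warnings;

# Hypothetical untrusted input. Two-argument open would treat the
# trailing "|" as a request to run "echo injected" as a command.
my $file = 'echo injected |';

# Three-argument open: the mode is fixed to '<' and $file is only a name
my $opened = open my $fh, '<', $file;

if ($opened) {
  print "opened a file literally named '$file'\n";
  close $fh;
}
else {
  print "not opened: $!\n";
}
```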

This is also true when opening a pipe.

# ○ (three arguments)
open my $pipe, '-|', 'dir';

# × (two arguments)
open my $pipe, 'dir |';

Perform error handling when opening a file (required)

Make sure to handle errors whenever you open a file.

open my $fh, '<', $file
  or die "Cannot open '$file': $!";

If the open function fails, it returns undef, so use or to handle the error. $! contains the error message returned by the OS, so include it in the error message shown to the user.

Use the die function to terminate the program with an error message. Seen from other programs, the exit code is non-zero: die uses the value of $! when it is set, and 255 when neither $! nor $? holds an error code.

Be sure to handle errors not only when opening a file but whenever the program communicates with the outside world. Here "outside" means everything other than operations on memory: files, networks, and so on.
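
The same pattern applies to writing and closing, which can also fail (for example when the disk is full). The following is a minimal sketch, assuming a temporary file created with the core File::Temp module so that the example does not touch any real file.

```perl
use strict;
use warnings;
use File::Temp 'tempfile';

# Create a temporary file that is removed when the program exits
my ($tmp_fh, $file) = tempfile(UNLINK => 1);
close $tmp_fh
  or die "Cannot close '$file': $!";

# Check every operation that touches the outside world
open my $out, '>', $file
  or die "Cannot open '$file' for writing: $!";
print {$out} "hello\n"
  or die "Cannot write to '$file': $!";
close $out
  or die "Cannot close '$file': $!";

my $size = -s $file;
print "wrote $size bytes\n";
```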

Use lowercase letters and underscores in lexical variable and subroutine names (highly recommended)

Use lowercase letters and underscores in the names of lexical variables declared with my and in the names of subroutines.

# Lexical variable
my $user_name;
my $search_word;
my $max_database_connection;

# Subroutine
sub parse_data {
  ...
}

sub create_table {
  ...
}

Whether this notation is good or bad, most of the newer modules on CPAN are written this way. Following the convention has a real benefit: it gives users a unified interface.

I have written a separate article on how to name variables, so please refer to that as well.

As a general rule, name a subroutine "verb + noun". When the meaning is clear, a verb alone may be acceptable for the convenience of the user. However, a bare verb can cause problems later, so I think "verb + noun" is the safer choice.

Use uppercase letters and underscores for package variables (highly recommended)

Use uppercase letters and underscores in the names of package variables.

our $OBJECT_COUNT;
our $CLASS_INFO;

Use lexical variables instead of package variables (required)

Unless you are writing a module, you will rarely need a package variable. If you are using package variables in a single script, that is the wrong way to use them. Change them to lexical variables declared with my.

Write in a standard (?) code format (slightly recommended)

Coding style is a matter of taste, but I think it is slightly better to follow the style introduced in Perl Best Practices and the format produced by the code formatting tool Perltidy.

As an example, see the source code of Mojo::URL. You can learn the style by imitating it.

Here are a few points to imitate.

1. Look at where spaces are placed around if and foreach statements and subroutines

# A space follows if; there are no spaces just inside the parentheses
if ($flg) {
  ...
}

2. Look at how comments are written and where blank lines are inserted

Look at how comments are written and where blank lines are placed. Most modules on CPAN do not have generous comments, but personally I would appreciate some concise comments in the source code.

3. No tabs; the indent width is 4 (or 2)

This is what Perl Best Practices recommends. (Recently I use 2 spaces.) Some people prefer tabs, so this is a matter of personal preference.

Use the Encode module to properly handle multibyte characters such as Japanese (highly recommended)

Use the Encode module to properly handle multibyte characters such as Japanese. See the separate article on the Encode module for details.

Techniques such as using Jcode.pm or jcode.pl, which appear in older explanations, are no longer recommended. Starting with Perl 5.8, the Encode module is the standard and least problematic method.
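
As a minimal sketch of the usual flow (decode bytes on input, work with the character string, encode on output), the following uses the UTF-8 bytes of a two-character Japanese word written as escapes. The choice of UTF-8 as the encoding is an assumption for the example.

```perl
use strict;
use warnings;
use Encode qw/decode encode/;

# UTF-8 bytes of the two characters U+65E5 U+672C ("Japan"),
# as they would arrive from a file or a socket
my $bytes = "\xE6\x97\xA5\xE6\x9C\xAC";

# Decode the bytes into a Perl character string
my $string = decode('UTF-8', $bytes);

my $byte_count = length $bytes;
my $char_count = length $string;

# Encode back into bytes before writing to the outside
my $output = encode('UTF-8', $string);

print "$byte_count bytes, $char_count characters\n";
```

Functions such as length, substr, and regular expressions count characters correctly only on the decoded string.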

Do not use the default variable $_ (highly recommended)

Perl has a default variable called $_. It is the variable that many functions use implicitly when no argument is given. Please refrain from using it in a program, because it reduces readability.

Use the default variable only in the following cases:

1. One-liners

I think it is fine in a one-liner. $_ is used as the argument of print and as the target of the regular expression.

# One-liner that prints lines containing the characters AAA
perl -ne "print if /AAA/"

2. map function, grep function, and postfix for

The map function, the grep function, and postfix for all pass each element in $_, so you have to use it there.

my @greped_array = grep { $_ =~ /AAA/ } @array;
my @mapped_array = map { $_ * 2 } @array;
print $_ for @array;

Some well-known CPAN modules rely on the default variable, but I personally do not recommend it. A program that is as explicit as possible is more readable.

Declare a lexical variable in the foreach statement (highly recommended)

Perl 5 allows you to declare a lexical variable at the beginning of a foreach statement.

my @students = ('taro', 'kenji', 'naoya');
for my $student (@students) {
  ...
}

In this example, each element of @students goes into $student. $student is a lexical variable that has a scope from the beginning to the end of the foreach block.

It is possible to omit the lexical variable, but it is not recommended.

# Not recommended
for (@students) {
  ...
}

How to receive command line arguments (reference)

It is good to receive command line arguments like this.

# When there is one command line argument
my $file = shift;

# If there are multiple command line arguments
my ($file, $option) = @ARGV;

How to receive subroutine arguments (reference)

Same as for command line arguments.

# When there is one argument
sub func {
  my $file = shift;
}

# When there are multiple arguments
sub func {
  my ($file, $option) = @_;
}

Use the standard modules for date processing (reference)

If you are using Perl 5.10 or later, the Time::Piece module is bundled as standard and can be used for date processing.

If you can install modules from CPAN, you may also want to install the DateTime module. It is highly functional but a little heavy.

If neither is available, use localtime or Time::Local.
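
As a minimal sketch of date processing with Time::Piece, the following parses a timestamp, formats it, and adds one day. The timestamp and the format strings are arbitrary values chosen for the example.

```perl
use strict;
use warnings;
use Time::Piece ();
use Time::Seconds 'ONE_DAY';

# Parse a fixed timestamp (a leap day, to show the date arithmetic)
my $t = Time::Piece->strptime('2024-02-29 13:45:00', '%Y-%m-%d %H:%M:%S');

my $date      = $t->ymd;                          # 2024-02-29
my $formatted = $t->strftime('%Y/%m/%d %H:%M');   # 2024/02/29 13:45

# Adding a number of seconds returns a new Time::Piece object
my $next_day = ($t + ONE_DAY)->ymd;               # 2024-03-01

print "$date\n$formatted\n$next_day\n";
```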

Do not load unnecessary modules (required)

If you copied source code from another program, modules that the new program never uses may still be loaded. Be sure to remove them, because they can mislead people who read the code later.

# Because the source code was copied from another program,
# unnecessary module loading may remain.
# Be careful.
use File::Spec;
use File::Basename 'basename';

How to write Perl documentation (reference)

For small scripts used at work, it is a good idea to embed the documentation inside the script. In a CPAN module the documentation usually comes at the end of the source code, but in a small script it is convenient to write it at the beginning so that the user sees it at a glance when opening the file.

Perl documentation is written in a notation called POD. I will only introduce a simple way to write it. A heading starts with "=head1", with the title written to its right. Leave one blank line and then write the text. Note that the blank line is significant. The document ends with the line "=cut". Written in English, it looks like this.

=head1 SCRIPT NAME

SomeScript.pl

=head1 DESCRIPTION

This script is used to do ....

=head1 USAGE

perl SomeScript.pl file1 file2 ...

=cut

# Beginning of source code
use strict;
use warnings;

For in-house projects, I think it is best to write the documentation in Japanese so that your colleagues can understand it easily.

=head1 script name

SomeScript.pl

=head1 overview

It is for

=head1 How to use

perl SomeScript.pl file1 file2 ...

=cut

# Beginning of source code
use strict;
use warnings;

Avoid a storm of # in comments (recommended)

I often come across comments full of #, but I personally do not recommend them. The main reason is that once such a comment is written, everyone who comes later has to imitate it. It makes me weary to think that I have to reproduce all of this just to write one function. Also, instead of improving the quality of the code, such comments often fail to keep up when the function is rewritten. So let's stop.

####################################################################
# Function name: Honyara #
# Arguments: Argument 1 Argument 2 #
# Return value: Honyara #
# Created: Ahhhh #
# Creator: Hore Hore #
# Function description: Good Good Good #
# Update history: Part 1 #
# : Part 2 #
# : Part 3 #
####################################################################
sub func {
    
}

We recommend this writing style.

# Brief commentary (in one line)
sub func {
    
}

String list operator (reference)

The string list operator qw is used often, so I will explain it. It is mostly used to create an array of strings.

my @strings = qw/aa bb cc/;

It has the same meaning as the following description.

my @strings = ('aa', 'bb', 'cc');

Be explicit when importing module functions (highly recommended)

It is a good idea to name the functions explicitly when importing them from a module. A reader of the source code can then easily tell which module each function belongs to.

use File::Basename 'basename';
use File::Copy qw/copy move/;
use File::Path 'mkpath';
use Encode qw/encode decode/;

# Use the mkpath function.
mkpath $dir;

What if there was no explicit import description?

use File::Basename;
use File::Copy;
use File::Path;
use Encode;

# I don't know where this was imported from
mkpath $dir;

In such a case, you may end up reading the documentation for all the modules being used. You may know from which module the function was imported, but it's not explicit to anyone reading the source code. So be sure to explicitly specify the function you want to import, no matter how obvious it may seem to you.

(Reference) File::Path, File::Copy

Don't use goto (required except in really special cases)

Perl has a goto statement, but you should not use it. Programming with goto is a thing of the past, and not only in Perl. If you feel you need goto, there is almost certainly an alternative.

If you want to control a loop, use last and next. If you want to handle an error, use die to throw an exception.

(The only time you need goto is in a really special case, such as very deep recursion where you do not want the call stack to grow; the goto &sub form exists for this.)

Do not use do-while (recommended)

Anything that can be written with a do-while statement can also be written with a while statement. Using do-while does not make the code more concise. On the contrary, because it is rarely used, I find that it makes the intention harder to understand.

It is therefore recommended to work out logic that uses a while statement instead.
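
As a sketch of such a rewrite, the loop below runs its body at least once and checks the condition at the end, which is what do-while would do, but it uses an ordinary while loop with last. One practical reason to prefer this form is that do BLOCK is not a real loop, so next and last do not work inside it. The data is made up for the example.

```perl
use strict;
use warnings;

my @queue = (3, 5, 0, 7);
my @processed;

# Same flow as a do-while loop: body first, condition afterwards
while (1) {
  my $item = shift @queue;
  push @processed, $item;

  # Stop after processing the first zero
  last if $item == 0;
}

print "@processed\n";
```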

Do not use redo (recommended)

Perl has a redo statement, but you can write the same logic without it. I have used redo several times, and I find programming with it very confusing. Since the same logic can always be written without redo, it is recommended to work out logic that does not use it.

Do not use prototypes (highly recommended)

Perl has a feature called a prototype that lets you declare the kinds of arguments when defining a subroutine, but do not use it.

# Do not use prototypes
sub func ($@) {
  ...
}

Perl subroutines accept arguments of any type without a type declaration, and they accept any number of arguments. There is therefore no need to specify the type or the number in a prototype. Always define subroutines without a prototype.

sub func {
  ...
}
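
One concrete reason to avoid prototypes, in addition to the point above, is that they change how the arguments at the call site are evaluated. In this sketch (the subroutine names are hypothetical) the ($) prototype forces the array into scalar context, so the subroutine receives the number of elements instead of the first element.

```perl
use strict;
use warnings;

# With a prototype: the single argument is evaluated in scalar context
sub first_with_prototype ($) {
  my ($first) = @_;
  return $first;
}

# Without a prototype: the array is flattened into @_ as usual
sub first_without_prototype {
  my ($first) = @_;
  return $first;
}

my @students = ('taro', 'kenji', 'naoya');

my $with    = first_with_prototype(@students);     # number of elements
my $without = first_without_prototype(@students);  # first element

print "with prototype: $with\n";
print "without prototype: $without\n";
```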

Use die instead of returning undef when reporting an error (recommended)

You might think that Perl has no exception handling. It has no built-in exception classes like Java, but it does have a concise exception mechanism.

First, let's look at the old style of error handling, which returns undef. If you write a bare return when an error occurs, undef is returned in scalar context and an empty list () is returned in list context.

# Return undef when an error occurs
sub func {
  my $arg = shift;
  
  ...
  
  # Error handling
  if ($error) {
    return;
  }
  # Correct value if no error occurred
  return $val;
}

The caller of the function then handles the error.

my $val = func();

# Exit the program if $val is a false value
die "Error" unless $val;

The problem with this approach is that if the caller of func neglects to check the return value, the program simply carries on.

So modern Perl uses die to throw an exception when reporting an error.

# Use die to throw an exception when an error occurs
sub func {
  my $arg = shift;
  
  # Some processing
  
  # Use die to throw an exception
  if ($error) {
    die "Error message";
  }

  # Correct value if no error occurred
  return $val;
}

With this approach, if an error occurs while func is running, an error message is displayed and the program terminates.

# If an error occurs, an error message is displayed and the program ends.
func();

If you do not want the program to terminate, use an eval block. Think of it as try and catch in Java. If an error occurs, its content is set in the predefined variable $@, so you can check this variable to see whether an error occurred.

eval { func() };

if ($@) {
  # Describe the processing when an error occurs
}
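
Putting the pieces together, here is a minimal runnable sketch. The subroutine parse_positive_int and its inputs are hypothetical. It uses a common variant of the idiom in which the eval block ends with a true value, so that success can be judged from the return value of eval and $@ is read only on failure.

```perl
use strict;
use warnings;

# Hypothetical subroutine that throws an exception on bad input.
# The trailing newline keeps Perl from appending the line number.
sub parse_positive_int {
  my ($text) = @_;

  die "Not a positive integer: '$text'\n"
    unless $text =~ /\A[1-9][0-9]*\z/;

  return $text + 0;
}

my @results;
for my $input ('42', 'abc') {
  my $value;

  # eval returns the final 1 on success and undef if die was called
  my $ok = eval {
    $value = parse_positive_int($input);
    1;
  };

  if ($ok) {
    push @results, "ok: $value";
  }
  else {
    chomp(my $message = $@);
    push @results, "error: $message";
  }
}

print "$_\n" for @results;
```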

Use the arrow operator when calling the constructor

It's a good idea to use the arrow operator to call the constructor. In Perl, there is virtually no difference between a constructor and other methods.

my $obj = SomeClass->new;

Indirect object syntax is discouraged and may be deprecated in the future.

# Indirect object call
my $obj = new SomeClass();
