1. Perl
  2. File input/output
  3. here

Read a csv file and convert it to an array of Perl hashes

This is an example that reads a csv format file and converts it into an array of hashes.

Readability is a bit low because we are actively using a feature called hash slice. If you haven't used Perl for a while and aren't used to it, it may look cryptographic. The structure is almost the same as the previous example, so I will explain only the different parts.

use strict;
use warnings;

# Argument processing
my $file = shift;

unless ($file) {
    die "Usage: $0 file"; # If there is no argument, indicate the usage and exit.
}

# Create a key for each column of csv
my $headers = ['name', 'age', 'country'];

# Parse the file and convert the csv format data to an array of arrays.
# The header key is also passed as an argument.
my @recs = parse_file ($file, $headers);

# Output (because it is an array of hashes, follow it with for)
for my $items (@recs) {
  # Concatenate using hash slice
  print join(',', @{$items}{@$headers}) . "\n";
}

# Function for file analysis (I just write it back this time ...)
sub parse_file {
  my ($file, $headers) = @_;
  
  open(my $fh, "<", $file)
    or die "Cannot open $file for read:$!";
  
  # Prepare a reference to an array that stores multiple records
  my $recs = [];
    while (my $line = <$fh>) {
      # Remove a line break
      chomp $line;
      
      # Prepare a reference to the hash that stores the data
      # Header using hash slice
      Assign to the key corresponding to #
      my $items = {};
      @{$items}{@$headers} = split(',', $line);

      # Since the first argument of the push function is an array, dereference it with @$recs.
      push @$recs, $items;
    }
    close $fh;
    wantarray? return @$recs: return $recs;
}

Below is an example of csv data. Create a csv file, give it to the first argument of the script, and execute it.

masao, 10, Japan
taro, 20, USA
rika, 38, France

Code explanation

(1) Creating an array of hash keys

my $headers = ['name', 'age', 'country'];

By using the hash key, we give meaning to what was just an array last time. The first item of the csv file corresponds to name, the second item corresponds to age, and the third item corresponds to country.

(2) Read the file and convert the csv format text to an array of hashes

my @recs = parse_file ($file, $headers);

parse_file is a self-made function that reads a file and converts csv format text into an array of hashes. The input and output images are as follows.

masao, 10, Japan
taro, 20, USA
rika, 38, France

↓

@recs = (
  {name =>'masao', age => 10, country =>'Japan'},
  {name =>'taro', age => 20, country =>'USA'},
  {name =>'rika', age => 38, country =>'France'},
)

(3) Output an array of hashes

for my $items (@recs) {
  print join(',', @{$items}{@$headers}) . "\n";
}

Since it is an array, the outside is a foreach loop. Output $items, which is a reference to the hash, concatenated with commas with join function.

(3) - 1 Explanation of hash slice

I will explain the esoteric hash slice @{$items}{@$headers}.

First, $items is a reference to the hash.

{name =>'masao', age => 10, country =>'Japan'}

It looks like this.

I usually use%$items to dereference, but I want to dereference it so that hash slice can be used.

@{$items}{list of hash keys}

will do.

@{$items}{'name', 'age', 'country'}

If you write

('masao', 10, 'Japan')

Can be obtained.

Also, since ['name', 'age', 'country'] is assigned to $headers, you can get ('name', 'age', 'country') by dereferencing @$headers. increase.

@{$items}{@$headers}

Is the description

@{$items}{'name', 'age', 'country'}

Is the same as the description

('masao', 10, 'Japan')

Can be obtained.

join(',', @{$items}{@$headers})

As a result, it is completed by connecting with commas.

(4) Processing in a while loop

I'm using while statement to create a Perl data structure.

while (my $line = <$fh>) {
  # Remove a line break
  chomp $line;

  # Prepare a reference to the hash that stores the data
  my $items = {};
  # Header using hash slice
  Assign to the key corresponding to #
  @{$items}{@$headers} = split(',', $line);

  # The first argument of the push function is an array, so @$recs
  # And dereference.
  push @$recs, $items;
}

(4) - 1 Prepare a reference to the hash

my $items = {}; # Prepare a reference to the hash that stores the data

Prepare a reference to the hash you want to have in the array.

(4) - 2 Assign hash slice to corresponding key on the left side

@{$items}{@$headers} = split(',', $line);

Same as explained above. Make the left side a hash slice and assign the list divided by the split function to it.

Related Informatrion