perlapi - Data manipulation functions in XS

perlapi is the official API for manipulating Perl data from XS. It consists of macros and functions written in C.

Output, scalar, standard input/output

XS becomes much easier to follow once you know which XS (C) API call corresponds to each Perl operation.

Output to standard output

# Perl
print "Hello";

# XS
PerlIO_printf(PerlIO_stdout (), "Hello");

PerlIO_printf corresponds to printf in the C standard library. The first argument is a PerlIO * and the second is a char * format string. PerlIO_stdout returns the handle for standard output. XS code does not use the C standard I/O library (stdio); it uses the I/O functions that Perl provides instead. These are documented in perlapio.

String

# Perl
my $str = "Hello";

# XS
SV * sv_str = newSVpv ("Hello", 0);

A Perl scalar is represented by SV * in XS. Use the newSVpv function to create an SV * that holds a string. The first argument is the string. The second argument, of type STRLEN, is the length of the string. If you pass 0, the length is computed automatically with strlen, so the string must be NUL-terminated and cannot contain embedded NUL bytes.

There are also newSVpvn, which takes an explicit length, and newSVpvs, which takes a string literal and determines its length at compile time. Both avoid the strlen call.
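To see what passing 0 as the length means, here is a minimal sketch in plain C. It does not use the Perl API: toy_effective_len is a hypothetical helper that only models the rule "0 means measure with strlen", so treat it as an illustration rather than as how Perl implements newSVpv.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the length a newSVpv-style constructor would end up using. */
size_t toy_effective_len(const char *s, size_t len)
{
    assert(s != NULL);
    /* 0 means "measure it for me", and strlen stops at the first NUL byte */
    return len == 0 ? strlen(s) : len;
}
```

With the buffer "ab\0cd", a length of 0 yields 2 because strlen stops at the first NUL byte, while an explicit length of 5 keeps every byte. That is the case where an explicit length, as with newSVpvn, is the safer choice.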

Names in the C API are heavily abbreviated, which makes them hard to remember at first, but it gets easier with practice. In SV, S stands for scalar and V for value. The p in newSVpv stands for pointer, which in practice means a string, so it is easiest to remember p as "string".

Numerical value

# Unsigned integer
# Perl
my $num = 1;

# XS
SV * sv_num = newSVuv (1);

# Integer
# Perl
my $num = -1;

# XS
SV * sv_num = newSViv (-1);

# Numerical value
# Perl
my $num = 1.14;

# XS
SV * sv_num = newSVnv (1.14);

Perl code does not distinguish between kinds of numbers, but internally there are three types: unsigned integers (UV), signed integers (IV), and floating-point numbers (NV). NV is the most general, but UV and IV are more efficient. The value is stored inside the SV *. The U in UV stands for unsigned, the I in IV for integer, and the N in NV for numeric.

Undefined value (corresponding to undef)

In XS, Perl's undef is represented by &PL_sv_undef.

# Perl
my $val = undef;

# XS
SV * sv_val = &PL_sv_undef;

Check if the value is defined (corresponding to the defined function)

Use the SvOK macro to check whether an SV * holds a defined value. This is the equivalent of Perl's defined function.

# Perl
if (defined $val) {...}

# XS
if (SvOK (sv_val)) {...}

Conversion from SV * to char * type

When an SV * holds a string or a number and you want to pass it to a function that expects char *, convert it first.

char * str = SvPV_nolen (sv_some);

The argument of SvPV_nolen is an SV *. It returns a pointer to the string without reporting its length; use SvPV instead when you also need the length.

File input/output

# Perl
sub print_line {
  my $file = shift;
  
  open my $fh, '<', $file
    or die $!;
  
  while (my $line = <$fh>) {
    print $line;
  }
  
  close $fh;
}

# XS
SV *
print_line (...)
  PPCODE:
{
    char * file = SvPV_nolen (ST (0));
    PerlIO * infh = PerlIO_open(file, "r");
    SV * line = sv_2mortal (newSVpv ("", 0));
    
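    if (! infh) {
      /* Pass the message through "%s" so that a % in it is not read as a format directive */
      croak ("%s", strerror (errno));
    }
    
    while (sv_gets (line, infh, 0)) {
      /* Same reason: never use data from the file as the format string */
      PerlIO_printf(PerlIO_stdout (), "%s", SvPV_nolen (line));
    }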
    PerlIO_close(infh);
}

This is difficult. I am not certain that this is the right way to do file input/output.

Make SV * mortal

Whenever you create an SV *, make it mortal with sv_2mortal. Its reference count is then decremented automatically when the current scope is left, which frees it if nothing else refers to it.

How to write XS that avoids memory leaks is explained below.

Discover APIs for Perl operations

To write XS, you have to perform the operations you normally do in Perl through the C API. This is quite difficult, but every public API is listed in perlapi and perlapio, so search there. perlguts, which explains Perl's internals, is also helpful.

Array manipulation

This section shows the XS equivalents of Perl array operations.

Variable declaration.

# Perl
my @nums;

# XS
AV * av_nums = newAV ();

Assignment.

# Perl
$nums[0] = 1;

# XS
av_store (av_nums, 0, newSVuv (1));

Get an element. av_fetch returns a pointer to an SV * (that is, an SV **), so you need to dereference it with *. The third argument specifies whether to create the element if it does not exist. av_fetch returns NULL if there is no such element, so check the result before dereferencing.

# Perl
my $num = $nums[0];

# XS
SV ** const sv_num_ptr = av_fetch (av_nums, 0, FALSE);
SV * const sv_num = sv_num_ptr ? *sv_num_ptr : &PL_sv_undef;

The size of the array.

# Perl
my $count = @nums;

# XS (for historical reasons av_len returns the highest index, which is one less than the number of elements)
int count = av_len (av_nums) + 1;

push function.

# Perl
push @nums, 1;

# XS
av_push(av_nums, newSVuv (1));

pop function.

# Perl
my $num = pop @nums;

# XS
SV * sv_num = av_pop(av_nums);

shift function.

# Perl
my $num = shift @nums;

# XS
SV * sv_num = av_shift(av_nums);

unshift function. av_unshift does not store a value; it only adds empty slots at the beginning of the array, so you need to set the value with av_store.

# Perl
unshift @nums, 1;

# XS
av_unshift(av_nums, 1);
av_store (av_nums, 0, newSVuv (1));
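
The difference between making room and storing a value can be sketched in plain C. This is a toy model and not the Perl API: ToyArray, toy_unshift, and toy_store are hypothetical names, and 0 stands in for an empty slot.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_CAP 8

typedef struct {
    int slot[TOY_CAP];  /* 0 stands in for an empty (undef) slot */
    int count;          /* number of slots in use */
} ToyArray;

/* Open up `num` empty slots at the front, in the spirit of av_unshift: no value is stored. */
void toy_unshift(ToyArray *a, int num)
{
    assert(a != NULL && num >= 0 && a->count + num <= TOY_CAP);
    /* Move the existing elements toward the back, then blank the freed slots */
    memmove(a->slot + num, a->slot, (size_t)a->count * sizeof a->slot[0]);
    memset(a->slot, 0, (size_t)num * sizeof a->slot[0]);
    a->count += num;
}

/* Overwrite one slot that already exists, in the spirit of av_store. */
void toy_store(ToyArray *a, int index, int value)
{
    assert(a != NULL && index >= 0 && index < a->count);
    a->slot[index] = value;
}
```

Starting from the elements 7 and 8, toy_unshift (&a, 1) leaves an empty slot followed by 7 and 8. Only the following toy_store (&a, 0, 1) puts the value 1 at the front.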

Empty the array.

# Perl
@nums = ();

# XS
av_clear (av_nums);

Hash operation

This section shows the XS equivalents of Perl hash operations.

Confirmation of key existence

# Perl
exists $hash{"key"};

# XS
hv_exists(hash, "key", strlen ("key"));

Creating an array of hashes

I've been practicing XS recently. For the time being I follow the policy below, though I'm not sure it is the right one.

  1. Write the return type and the arguments on one line.
  2. Use only the PPCODE section.
  3. Enclose the body in {} so that variables can be declared at the beginning.
  4. Return exactly one SV *. Assign it to ST (0) and then call XSRETURN (1). Ignore the warning that RETVAL is not used.
  5. Call sv_2mortal on every SV *, AV *, and HV * immediately after creating it; otherwise memory leaks. The only exception is an SV * created to be stored directly as an element of a hash or array, which does not need sv_2mortal.

I will write XS this way for a while. The XS grammar is quite complicated, so I want to keep my code as simple and consistent as possible. I'll see how well this works in practice.

# The simplest Perl notation
sub return_array_of_hash {
    
  my $people = [
      {name =>'Ken', age => 19},
      {name =>'Taro', age => 26}
  ];
  
  return $people;
}

# Perl written step by step to mirror the XS code
sub return_array_of_hash {
    
  my @persons;
 
  my %person1;
  $person1{name} = 'Ken';
  $person1{age} = 19;
  push @persons, \%person1;
  
  my %person2;
  $person2{name} = 'Taro';
  $person2{age} = 26;
  push @persons, \%person2;
  
  return \@persons;
}

# XS
SV * return_array_of_hash ()
    PPCODE:
{
  AV * av_persons;
  HV * hv_person1;
  HV * hv_person2;
  
  av_persons = (AV *) sv_2mortal ((SV *) newAV ());
  
  hv_person1 = (HV *) sv_2mortal ((SV *) newHV ());
  hv_store (hv_person1, "name", 4, newSVpv ("Ken", 3), 0);
  hv_store (hv_person1, "age", 3, newSVuv (19), 0);
  av_push(av_persons, newRV_inc ((SV *) hv_person1));
  
  hv_person2 = (HV *) sv_2mortal ((SV *) newHV ());
  hv_store (hv_person2, "name", 4, newSVpv ("Taro", 4), 0);
  hv_store (hv_person2, "age", 3, newSVuv (26), 0);
  av_push(av_persons, newRV_inc ((SV *) hv_person2));
  
  /* newRV_inc takes an SV *, so cast the AV *; XPUSHs pushes the value onto the stack */
  XPUSHs (sv_2mortal (newRV_inc ((SV *) av_persons)));
  XSRETURN (1);
}

AV * is an array and HV * is a hash. newAV creates an empty array and newHV creates an empty hash. hv_store sets a hash value: the third argument is the key length, and if the fifth argument is 0 the hash value of the key is computed automatically. newRV_inc creates a reference and increments the reference count of its target. The XPUSHs macro pushes the return value onto the stack, and the argument of XSRETURN is the number of return values.

Creating an array of hashes Part 2

Based on gfx's advice, I rewrote the array-of-hashes code. The principles are: always make an SV *, AV *, or HV * mortal immediately after creating it; do not specify the length by hand when creating a string or setting a hash element; and increment the reference count by one when storing a value as an array or hash element. I then defined macros so that the code can be written as consistently and simply as possible. I don't know whether this is the best approach.

#define new_mAV() ((AV *) sv_2mortal ((SV *) newAV ()))
#define new_mHV() ((HV *) sv_2mortal ((SV *) newHV ()))
#define new_mSVpvs(s) sv_2mortal (newSVpvs (s))
#define new_mSVuv(u) sv_2mortal (newSVuv (u))
#define new_mSViv(i) sv_2mortal (newSViv (i))
#define new_mSVnv(n) sv_2mortal (newSVnv (n))
#define new_mRV(sv) sv_2mortal (newRV_inc ((SV *) (sv)))
#define set(e) SvREFCNT_inc (e)

MODULE = ExtModule PACKAGE = ExtModule

SV * return_array_of_hash ()
  PPCODE:
{
  AV * persons;
  HV * person1;
  HV * person2;
  
  persons = new_mAV ();
  
  person1 = new_mHV ();
  hv_stores (person1, "name", set (new_mSVpvs ("Ken")));
  hv_stores (person1, "age", set (new_mSVuv (19)));
  hv_stores (person1, "height", set (new_mSViv (170)));
  hv_stores (person1, "weight", set (new_mSVnv (45.3)));
  av_push(persons, set (new_mRV (person1)));
  
  person2 = new_mHV ();
  hv_stores (person2, "name", set (new_mSVpvs ("Taro")));
  hv_stores (person2, "age", set (new_mSVuv (26)));
  hv_stores (person2, "height", set (new_mSViv (180)));
  hv_stores (person2, "weight", set (new_mSVnv (39.3)));
  av_push(persons, set (new_mRV (person2)));
  
  ST (0) = new_mRV (persons);
  XSRETURN (1);
}
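
The rule "make it mortal, then add one to the reference count when storing it in a container" can be checked with a small model in plain C. This is only a sketch under my own assumptions: ToySV, toy_new, toy_mortal, toy_inc, and toy_free_tmps are hypothetical stand-ins for an SV, the newSV* functions, sv_2mortal, SvREFCNT_inc, and the cleanup that happens when the scope is left. Real Perl is more involved.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX 16

/* Toy stand-in for an SV: only the reference count is modelled. */
typedef struct {
    int refcnt;
    int alive;   /* set to 0 instead of really freeing, so it can be inspected */
} ToySV;

static ToySV  toy_pool[TOY_MAX];
static size_t toy_used = 0;
static ToySV *toy_temps[TOY_MAX];   /* values waiting for a deferred decrement */
static size_t toy_ntemps = 0;

/* A fresh value starts with a reference count of 1. */
ToySV *toy_new(void)
{
    assert(toy_used < TOY_MAX);
    ToySV *sv = &toy_pool[toy_used++];
    sv->refcnt = 1;
    sv->alive = 1;
    return sv;
}

/* Schedule one decrement for later and hand the value back. */
ToySV *toy_mortal(ToySV *sv)
{
    assert(sv != NULL && toy_ntemps < TOY_MAX);
    toy_temps[toy_ntemps++] = sv;
    return sv;
}

/* A container that keeps the value takes a count of its own. */
ToySV *toy_inc(ToySV *sv)
{
    assert(sv != NULL);
    sv->refcnt++;
    return sv;
}

/* Leaving the scope: apply every deferred decrement. */
void toy_free_tmps(void)
{
    while (toy_ntemps > 0) {
        ToySV *sv = toy_temps[--toy_ntemps];
        if (--sv->refcnt == 0)
            sv->alive = 0;   /* nobody owns it any more */
    }
}
```

In this model, toy_inc (toy_mortal (toy_new ())) plays the role of set (new_mSVuv (19)): the count goes to 1, then 2, and drops back to 1 at cleanup, so the container ends up as the only owner. A value that is only made mortal drops to 0 and is released.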

String manipulation

This section shows string operations in XS.

Creating a string

# Perl
my $str = "abc";

# XS
SV * sv_str = newSVpv ("abc", 0);

Copy of string

# Perl
my $str2 = $str1;

# XS
SV * sv_str2 = newSVsv (sv_str1);

String length

# Perl
my $length = length $str;

# XS (sv_len returns the length in bytes; use sv_len_utf8 for the length in characters)
STRLEN sv_length = sv_len (sv_str);

Concatenation of strings

# Perl
my $str = "abc";
$str .= "de";

# XS
SV * sv_str = newSVpv ("abc", 0);
sv_catpv (sv_str, "de");

Variable expansion (or sprintf)

# Perl
my $num1 = 1;
my $num2 = 2;
my $str = "$num1 and $num2";

# XS
int num1 = 1;
int num2 = 2;
SV * sv_str = newSVpvf ("%d and %d", num1, num2);

String concatenation + variable expansion

# Perl
my $num1 = 1;
my $num2 = 2;
my $str = "abc";
$str .= "$num1 and $num2";

# XS
int num1 = 1;
int num2 = 2;
SV * sv_str = newSVpv ("abc", 0);
sv_catpvf (sv_str, "%d and %d", num1, num2);

String comparison

Use sv_cmp if you want to compare strings that are SVs.

# Perl
if ($str1 lt $str2) {...}
if ($str1 eq $str2) {...}
if ($str1 gt $str2) {...}

# XS
if (sv_cmp (sv_str1, sv_str2) < 0) {...}
if (sv_cmp (sv_str1, sv_str2) == 0) {...}
if (sv_cmp (sv_str1, sv_str2) > 0) {...}

There are also the macros strEQ, strGE, strGT, strLE, strLT, and strNE, which compare char * strings.

if (strEQ ("foo", "bar")) {...}
if (strGE ("foo", "bar")) {...}
if (strGT ("foo", "bar")) {...}
if (strLE ("foo", "bar")) {...}
if (strLT ("foo", "bar")) {...}
if (strNE ("foo", "bar")) {...}
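
As far as I understand, these macros are thin wrappers around strcmp from the C standard library. The sketch below shows that idea in plain C with hypothetical toy_* names, so that it does not depend on Perl's headers. It is an assumption about the behavior, not a copy of Perl's definitions.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: compare two NUL-terminated strings byte by byte. */
int toy_cmp(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);  /* like strcmp, NULL is not accepted */
    return strcmp(a, b);
}

/* Each macro turns the three-way result of the comparison into a yes/no answer. */
#define toy_strEQ(a, b) (toy_cmp((a), (b)) == 0)
#define toy_strNE(a, b) (toy_cmp((a), (b)) != 0)
#define toy_strLT(a, b) (toy_cmp((a), (b)) <  0)
#define toy_strLE(a, b) (toy_cmp((a), (b)) <= 0)
#define toy_strGT(a, b) (toy_cmp((a), (b)) >  0)
#define toy_strGE(a, b) (toy_cmp((a), (b)) >= 0)
```

Under this model, toy_strLT ("bar", "foo") is true because "bar" sorts before "foo", and toy_strEQ ("foo", "bar") is false.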

Package variable

This section introduces the functions for manipulating package variables in XS.

Get a package variable

To get the package variable, write:

# Perl
$Foo::bar;
@Foo::bar;
%Foo::bar;

# XS
SV * sv_var = get_sv ("Foo::bar", 0);
AV * av_var = get_av ("Foo::bar", 0);
HV * hv_var = get_hv ("Foo::bar", 0);

The second argument, 0, means that no new variable is created.

Creating a package variable

To create a package variable in XS, specify GV_ADD as the second argument of get_sv, get_av, or get_hv.

# Perl
package Foo;
our $bar;
our @bar;
our %bar;

# XS
SV * sv_var = get_sv ("Foo::bar", GV_ADD);
AV * av_var = get_av ("Foo::bar", GV_ADD);
HV * hv_var = get_hv ("Foo::bar", GV_ADD);

Use a regular expression

This section explains how to use regular expressions in XS. However, the Perl documentation does not currently explain how to do this and I worked it out by experiment, so please let me know if anything is wrong.

Regular expression compilation

To use a regular expression in XS, first compile it with the pregcomp function. Think of this as the counterpart of creating a regular expression reference with qr//.

# Perl
my $re = qr/[0-9]+/;

# XS (the second argument of pregcomp is a regular expression flag)
SV * sv_re_str = newSVpv ("[0-9]+", 0);
REGEXP * sv_re = pregcomp (sv_re_str, 0);

REGEXP * is a kind of SV *, so in real code pass it to sv_2mortal so that the memory is released automatically. Written for use in real code, it looks like this.

SV * sv_re_str = sv_2mortal (newSVpv ("[0-9]+", 0));
REGEXP * sv_re = (REGEXP *) sv_2mortal ((SV *) pregcomp (sv_re_str, 0));

Regular expression flags

You can specify the regular expression flag as the second argument of pregcomp.

# Perl
my $re = qr/abc/im;

# XS
SV * sv_re_str = newSVpv ("abc", 0);
REGEXP * sv_re = pregcomp (sv_re_str, RXf_PMf_FOLD | RXf_PMf_MULTILINE);

The correspondence of the flags is as follows. To specify more than one, combine the flags with the bitwise OR operator "|".

/m  RXf_PMf_MULTILINE
/s  RXf_PMf_SINGLELINE
/i  RXf_PMf_FOLD
/x  RXf_PMf_EXTENDED

Regular expression execution

Use the pregexec function to execute a regular expression. The arguments of the pregexec function are very complicated.

/* pregexec - Match a regular expression against a string. */
I32
pregexec (
  REGEXP * const prog, /* Compiled regular expression */
  char * stringarg,    /* Position at which to start matching */
  char * strend,       /* End of the string (position of the terminating NUL) */
  char * strbeg,       /* Start of the string */
  SSize_t minend,      /* The end of the match must be at least minend bytes after stringarg */
  SV * screamer,       /* SV holding the string: used only for its utf8 flag */
  U32 nosave           /* Set to 1 to skip capturing */
)

If the match succeeds, the return value is true. Here is an example.

SV * sv_value = sv_2mortal (newSVpv ("12", 0));
char * value = SvPV_nolen (sv_value);

SV * sv_re_str = sv_2mortal (newSVpv ("^ *([-+]?[0-9]+) *$", 0));
REGEXP * sv_re = (REGEXP *) sv_2mortal ((SV *) pregcomp (sv_re_str, 0));

IV ret = pregexec (
  sv_re, // Compiled regular expression
  value, // Search start position
  value + strlen (value), // end of string (position of the terminating NUL)
  value, // beginning of string
  0, // 0 is OK
  sv_value, // SV * type string
  0 // 0 is OK
);

Get the matched string

Use the Perl_reg_numbered_buff_fetch function to fetch the matched string. (As far as I can tell, only this private API is available.)

# Perl
my $match1 = $1;
my $match2 = $2;

# XS
SV * sv_match1 = newSVpv ("", 0);
Perl_reg_numbered_buff_fetch (aTHX_ sv_re, 1, sv_match1);

SV * sv_match2 = newSVpv ("", 0);
Perl_reg_numbered_buff_fetch (aTHX_ sv_re, 2, sv_match2);

Object-oriented API

This section introduces the XS API related to object-oriented programming.

Create an object

Create an object. Be sure to make it mortal.

# Perl
my $self = {};
bless $self, "MyClass";

# XS
SV * sv_self = sv_2mortal (newRV_inc (sv_2mortal ((SV *) newHV ())));
sv_bless(sv_self, gv_stashpv ("MyClass", GV_ADD));

Check if it is an object

To check if the value is an object, use the sv_isobject function.

sv_isobject (sv_obj);

Check if it inherits a class

Use the sv_derived_from function to see if it inherits from a class. If it inherits the specified class, it will return true.

sv_derived_from (sv_obj, "MyClass");

This function can also be used to check that it is an array reference or a hash reference.

/* Array reference */
sv_derived_from (sv_obj, "ARRAY");

/* Hash reference */
sv_derived_from (sv_obj, "HASH");

So, to make sure it is an object and inherits a particular class, use a combination of the sv_isobject and sv_derived_from functions.

sv_isobject (sv_obj) && sv_derived_from (sv_obj, "MyClass")

Official Perl API documentation
