diff --git a/XSUB.h b/XSUB.h index 7bb60e5156f4..1fc27cd3ff12 100644 --- a/XSUB.h +++ b/XSUB.h @@ -30,7 +30,7 @@ C>. =for apidoc Amnu|type|RETVAL Variable which is setup by C to hold the return value for an XSUB. This is always the proper type for the XSUB. See -L. +L. =for apidoc Amnu|type|THIS Variable which is setup by C to designate the object in a C++ @@ -44,7 +44,7 @@ must be called prior to setup the C variable. =for apidoc Amn|Stack_off_t|items Variable which is setup by C to indicate the number of -items on the stack. See L. +items on the stack. See L. =for apidoc Amn|I32|ix Variable which is setup by C to indicate which of an diff --git a/dist/ExtUtils-ParseXS/lib/perlxs.pod b/dist/ExtUtils-ParseXS/lib/perlxs.pod index 26b8e19a06d3..5ed2143ff960 100644 --- a/dist/ExtUtils-ParseXS/lib/perlxs.pod +++ b/dist/ExtUtils-ParseXS/lib/perlxs.pod @@ -1,1971 +1,4576 @@ =head1 NAME -perlxs - XS language reference manual +perlxs - the XS Language Reference Manual + +=head1 SYNOPSIS + + /* This is a simple example of an XS file. The first half of an XS + * file is uninterpreted C code; all lines are passed through + * unprocessed. */ + + =pod + Except that any POD is stripped. + =cut + + /* Standard boilerplate: */ + + /* For efficiency, always define PERL_NO_GET_CONTEXT: not enabled by + * default for backwards compatibility. For details, see + * L + */ + #define PERL_NO_GET_CONTEXT + + #include "EXTERN.h" + #include "perl.h" + #include "XSUB.h" + #include "ppport.h" + + /* Any general C code here; for example: */ + + #define FOO 1 + static int + my_helper_function(int i) { /* do stuff */ } + + /* The first MODULE line starts the XS half of the file: */ + + MODULE = Foo::Bar PACKAGE = Foo::Bar + + # Indented '#' are XS code comments. + # C preprocessor directives are still allowed and are passed + # through: + #define BAR 2 + + # File-scoped XS directives + PROTOTYPES: DISABLE + + # A simple XSUB: generate a wrapper for the strlen() C library + # function. 
+ + int + strlen(char *s) + + =pod + A more complex example: + C: do a 16-bit multiply + =cut + + unsigned int + multi16(unsigned int i, \ + unsigned int j) + CODE: + i = i & 0xFFFF; + j = j & 0xFFFF; + RETVAL = (i * j) & 0xFFFF; + OUTPUT: + RETVAL =head1 DESCRIPTION -=head2 Introduction - -XS is an interface description file format used to create an extension -interface between Perl and C code (or a C library) which one wishes -to use with Perl. The XS interface is combined with the library to -create a new library which can then be either dynamically loaded -or statically linked into perl. The XS interface description is -written in the XS language and is the core component of the Perl -extension interface. - -This documents the XS language, but it's important to first note that XS -code has full access to system calls including C library functions. It -thus has the capability of interfering with things that the Perl core or -other modules have set up, such as signal handlers or file handles. It -could mess with the memory, or any number of harmful things. Don't. -Further detail is in L, which you should read before actually -writing any production XS. - -An B forms the basic unit of the XS interface. After compilation -by the B compiler, each XSUB amounts to a C function definition -which will provide the glue between Perl calling conventions and C -calling conventions. - -The glue code pulls the arguments from the Perl stack, converts these -Perl values to the formats expected by a C function, calls this C function, -and then transfers the return values of the C function back to Perl. -Return values here may be a conventional C return value or any C -function arguments that may serve as output parameters. These return -values may be passed back to Perl either by putting them on the -Perl stack, or by modifying the arguments supplied from the Perl side. - -The above is a somewhat simplified view of what really happens. 
Since -Perl allows more flexible calling conventions than C, XSUBs may do much -more in practice, such as checking input parameters for validity, -throwing exceptions (or returning undef/empty list) if the return value -from the C function indicates failure, calling different C functions -based on numbers and types of the arguments, providing an object-oriented -interface, etc. - -Of course, one could write such glue code directly in C. However, this -would be a tedious task, especially if one needs to write glue for -multiple C functions, and/or one is not familiar enough with the Perl -stack discipline and other such arcana. XS comes to the rescue here: -instead of writing this glue C code in long-hand, one can write -a more concise short-hand I of what should be done by -the glue, and let the XS compiler B handle the rest. - -The XS language allows one to describe the mapping between how the C -routine is used, and how the corresponding Perl routine is used. It -also allows creation of Perl routines which are directly translated to -C code and which are not related to a pre-existing C function. In cases -when the C interface coincides with the Perl interface, the XSUB -declaration is almost identical to a declaration of a C function (in K&R -style). In such circumstances, there is another tool called C -that is able to translate an entire C header file into a corresponding -XS file that will provide glue to the functions/macros described in -the header file. - -The XS compiler is called B. This compiler creates -the constructs necessary to let an XSUB manipulate Perl values, and -creates the glue necessary to let Perl call the XSUB. The compiler -uses B to determine how to map C function parameters -and output values to Perl values and back. The default typemap -(which comes with Perl) handles many common C types. A supplementary -typemap may also be needed to handle any special structures and types -for the library being linked. 
For more information on typemaps, -see L. - -A file in XS format starts with a C language section which goes until the -first C> directive. Other XS directives and XSUB definitions -may follow this line. The "language" used in this part of the file -is usually referred to as the XS language. B recognizes and -skips POD (see L) in both the C and XS language sections, which -allows the XS file to contain embedded documentation. - -See L for a tutorial on the whole extension creation process. - -Note: For some extensions, Dave Beazley's SWIG system may provide a -significantly more convenient mechanism for creating the extension -glue code. See L for more information. - -For simple bindings to C libraries as well as other machine code libraries, -consider instead using the much simpler -L interface via CPAN modules like -L or L. - -=head2 On The Road - -Many of the examples which follow will concentrate on creating an interface -between Perl and the ONC+ RPC bind library functions. The rpcb_gettime() -function is used to demonstrate many features of the XS language. This -function has two parameters; the first is an input parameter and the second -is an output parameter. The function also returns a status value. - - bool_t rpcb_gettime(const char *host, time_t *timep); - -From C this function will be called with the following -statements. - - #include - bool_t status; - time_t timep; - status = rpcb_gettime( "localhost", &timep ); - -If an XSUB is created to offer a direct translation between this function -and Perl, then this XSUB will be used from Perl with the following code. -The $status and $timep variables will contain the output of the function. - - use RPC; - $status = rpcb_gettime( "localhost", $timep ); - -The following XS file shows an XS subroutine, or XSUB, which -demonstrates one possible interface to the rpcb_gettime() -function. This XSUB represents a direct translation between -C and Perl and so preserves the interface even from Perl. 
-This XSUB will be invoked from Perl with the usage shown -above. Note that the first three #include statements, for -C, C, and C, will always be present at the -beginning of an XS file. This approach and others will be -expanded later in this document. A #define for C -should be present to fetch the interpreter context more efficiently, -see L for details. - - #define PERL_NO_GET_CONTEXT - #include "EXTERN.h" - #include "perl.h" - #include "XSUB.h" - #include - - MODULE = RPC PACKAGE = RPC - - bool_t - rpcb_gettime(host,timep) - char *host - time_t &timep - OUTPUT: - timep - -Any extension to Perl, including those containing XSUBs, -should have a Perl module to serve as the bootstrap which -pulls the extension into Perl. This module will export the -extension's functions and variables to the Perl program and -will cause the extension's XSUBs to be linked into Perl. -The following module will be used for most of the examples -in this document and should be used from Perl with the C -command as shown earlier. Perl modules are explained in -more detail later in this document. - - package RPC; - - require Exporter; - require DynaLoader; - @ISA = qw(Exporter DynaLoader); - @EXPORT = qw( rpcb_gettime ); - - bootstrap RPC; - 1; - -Throughout this document a variety of interfaces to the rpcb_gettime() -XSUB will be explored. The XSUBs will take their parameters in different -orders or will take different numbers of parameters. In each case the -XSUB is an abstraction between Perl and the real C rpcb_gettime() -function, and the XSUB must always ensure that the real rpcb_gettime() -function is called with the correct parameters. This abstraction will -allow the programmer to create a more Perl-like interface to the C -function. +This is the reference manual for the XS language. This is a type of +template language from which is generated a C code source file that +contains functions written in C, but which can be called from Perl, and +which behave just like Perl subs. 
These are known as I or +I subs, or XSUBs for short. + +Note that this POD file was heavily rewritten and modernised in 2025. +Various old practices, such as "K&R" XSUB function signature declarations, +are no longer encouraged; but much old code will still be using them, so +be cautious when using old code as examples for writing new code. + +The syntax described in this document is valid for Perl installations back +to 5.8.0 unless otherwise specified. + +=head1 THE FORMAL SYNTAX OF AN XS FILE + +This is a BNF-like description of the syntax of an XS file. It is intended +to be human-readable rather than machine-readable, and doesn't try to +accurately specify where line breaks can occur. + + Key: + + foo BNF token. + "bar" Literal terminal symbol. + /.../ Terminal symbol defined by a pattern. + [Foo::Bar] Terminal symbol defined by way of an example. + * + ? | ( ) These have their usual regex-style meanings. + // ... BNF Comments. + + + XS_file = C_file_part ( module_decl XS_file_part )+ + + C_file_part = ( + // Lines of C code (including /* ... */), + // which are all passed through uninterpreted. + | + pod // These are stripped. + )* + + pod = /^=/ .. /^=cut\s*$/ + + module_decl = blank_line + // NB: all on one line: + "MODULE =" [Foo::Bar] "PACKAGE =" [Foo::Bar] + ( "PREFIX =" [foo_] )? + + blank_line = /^\s*$/ + + XS_file_part = ( file_scoped_decls* xsub )* + + file_scoped_decls = + blank_line + // Any valid CPP directive: these are passed through: + | "#if" | "# if" | "#define" | // etc + | #comment // anything not recognised as CPP directive + | pod + | "SCOPE:" enable + | "EXPORT_XSUB_SYMBOLS:" enable + | "PROTOTYPES:" enable + | "VERSIONCHECK:" enable + | "FALLBACK:" ("TRUE" | "FALSE" | "UNDEF") + | "INCLUDE:" [foo.xs] + | "INCLUDE_COMMAND:" [... some command line ...] + | "REQUIRE:" [1.23] // min xsubpp version + | "BOOT:" + code_block + | "TYPEMAP: <<"[EOF] + // Heredoc with typemap declarations. 
+ [EOF] + + enable = ( "ENABLE" | "DISABLE" ) + + code_block = // Lines of C and/or blank lines terminated by the + // next keyword or XSUB start. POD is stripped. + + xsub = blank_line // not *always* necessary + xsub_decl + ( cases | xbody ) + + xsub_decl = return_type + xsub_name "(" parameters ")" "const" ? + + return_type = "NO_OUTPUT" ? "extern \"C\"" ? "static" ? C_type + + C_type = [const char *] // etc: any valid C type + + C_expression = [foo(ix) + 1] // etc: any valid C expression + + xsub_name = [foo] | [X::Y::foo] // simple name or C++ name + + parameters = empty + | parameter ( "," parameter )* + + empty = /\s*/ + + parameter = ( + in_out_decl ? + C_type ? + /\w+/ // variable name + // Default or optional value: + ( "=" ( C_expression | "NO_INIT" ) )? + + // Pseudo-param: foo must match another param name: + | C_type "length(" [foo] ")" + | "..." + ) + + in_out_decl = "IN" | "OUT" | "IN_OUT" | "OUTLIST" | "IN_OUTLIST" + + cases = ( + "CASE:" ( C_expression | empty ) + xbody + )+ + + xbody = implicit_input ? + xbody_input_part * + xbody_init_part * + xbody_code_part + xbody_output_part * // Not after PPCODE. + xbody_cleanup_part * // Not after PPCODE. + + implicit_input = ( blank_line | input_line )+ + + xbody_input_part = + "INPUT:" ( blank_line | input_line )* + | "PREINIT:" + code_block + | xbody_generic_key + | c_args + | interface_macro + | "SCOPE:" enable // Only in perl 5.44.0 onwards. + + input_line = C_type + "&" ? + /\w+/ // variable name + // Optional initialiser: + ( + ( "=" | ";" ) "NO_INIT" + | + // Override or add to the default typemap. + // The expression is eval()ed as a + // double-quotish string. + "=" [ a_typemap_override($arg) ] + | + ("+" | ";") [ a_deferred_initialiser($arg) ] + )? + ";" ? 
+ + xbody_init_part = "INIT:" + code_block + | xbody_generic_key + | c_args + | interface + | interface_macro + + xbody_code_part = + autocall + | "CODE:" + code_block + | "PPCODE:" + code_block + | // Only recognised if immediately following + // an INPUT section: + "NOT_IMPLEMENTED_YET:" + + // Implicit call to wrapped library function. + autocall = empty + + xbody_output_part = + xbody_postcall * + xbody_output * + + xbody_postcall = "POSTCALL:" + code_block + | xbody_generic_key + + xbody_output = "OUTPUT:" + ( blank_line + | output_line + | "SETMAGIC:" enable + )* + | xbody_generic_key + + // Variable name with optional expression which + // overrides the typemap + output_line = /\w+/ ( [ sv_setfoo(ST[0], RETVAL) ] )? + + xbody_cleanup_part = "CLEANUP:" + code_block + | xbody_generic_key + + // Text to use as the arguments for an autocall; + // may be spread over multiple lines: + c_args = "C_ARGS:" [foo, bar, baz] + + // Comma-separated list of Perl subroutine names + // which use the XSUB, over one or more lines: + interface = "INTERFACE:" [foo, bar, Bar::baz] + + interface_macro = + "INTERFACE_MACRO:" + [GET_MACRO_NAME] + [SET_MACRO_NAME] ? + + + // These can appear anywhere in an XSUB. 
+ xbody_generic_key = pod + | alias + | "PROTOTYPE:" ( enable | [$$@] ) + + // Whitespace-separated list of overload types, + // over one or more lines: + | "OVERLOAD:" [ cmp eq <=> etc ] + + // Whitespace-separated list of attribute names, + // over one or more lines: + | "ATTRS:" [foo bar baz] + + + alias = "ALIAS:" + // One or more lines; each with zero or more + // {alias_name, op, index} triplets: + ( + [bar] "=" [5] + | [Foo::baz] "=" [A_CPP_DEFINE] + | [Bar::boz] "=>" [Foo::baz] + )* + +=head1 OVERVIEW OF XS AND XSUBS + +=head2 Initial and Further Reading + +This document is structured on the assumption that you are already +familiar with the very basics of XS and XSUBs; in particular, the code +examples may make use of common keywords that are only described later in +the file. But once you have that basic familiarity, then this document +may be read through in order. + +It is in two main parts. First there is a long overview part, which +explains (in great detail) what XS and XSUBs are, how the perl interpreter +calls XSUBs, and how data is passed to and from an XSUB. There is much +more detail here than is strictly necessary for writing simple XSUBs, but +this document is intended to be comprehensive. Then comes the reference +manual proper, which has a section for each keyword and other parts of an +XSUB declaration and definition, plus a few more general topics, such as +using typemaps and storing static data. + +If necessary, read L first for a gentler tutorial introduction. +In addition, you may find the following Perl documents useful. + +=over + +=item * + +L: this describes typemap files and how to create new +typemaps. These are code templates which are used by the XS compiler to +automatically generate code which converts between Perl and C data types. +Creating an interface to a C library is sometimes mainly a case of adding +new typemap entries to handle the new data types which the library uses. 
+ +=item * + +L: this contains details of selected parts of the internals of +the Perl interpreter. A better understanding of that may help when writing +more complex XSUBs, or when debugging. + +Note that much of the L part of this document is just a summary of +the parts of perlguts which are most relevant to writing XS code, +possibly saving you from having to actually read that document. + +=item * + +L: this describes how and when to use the basic C and OS library +functions from XS. Often, the Perl API contains functions which you should +use instead of the standard C library ones, e.g. using C +instead of C. + +=item * + +L: this describes how to call Perl functions and do the +equivalent of C from C. + +=item * + +L: this describes how to embed a complete Perl interpreter +within another application. + +=back + +=head2 An Introduction to XS and XSUBs + +Formally, an XSUB is a compiled function, typically written in C or C++, +which can be called from Perl as if it was a Perl function. A collection +of them are compiled into a C<.so> or C<.dll> library file and are usually +dynamically loaded at C time (but can in principle be +statically linked into the perl interpreter). + +From Perl, an XSUB looks just like any other sub and is called in the same +way. In the most general case, an XSUB can be passed and return arbitrary +lists of values. More commonly, such as when XSUBs are being used as thin +wrappers to call existing C library functions, they might take a fixed list +of arguments and return a single result (or zero items for a C +function). + +An XS file is a template file format which contains a mixture of C code +and XSUB declarations. It is used to generate XSUBs where most boilerplate +code is handled automatically: e.g. converting argument values between C +and Perl. + +Note that this document refers to both the thing in the XS file, and to +the C function generated from it, as an XSUB. It should be clear from the +context which is being referred to.
+ +When XSUBs are being used as a thin wrapper between Perl and the functions +in a particular C library, the XSUB definitions in the XS file are often +just a couple of lines, consisting of a declaration of the name, +parameters and return type. The XS parser will do almost all the heavy +lifting for you. + +XS is optional; in principle you can write your own C code directly, or +use other systems such as C or SWIG. For creating simple +bindings to existing compiled libraries, there is also the +L interface via CPAN modules like +L or L. Note that creating XS may initially take +more effort than those, but it is lightweight in terms of dependencies. + +XSUBs have three main roles. They can be used as thin wrappers for C +library functions, e.g. L. They can be used to write +functions which are faster than pure Perl or easier to do in C, e.g. +L. Or they can be used to extend the Perl interpreter itself, +e.g. L. + +XS has extensive support for the first role, and makes writing the second +need less boilerplate code. This document doesn't cover the third role, +which often requires extensive knowledge of the Perl interpreter's +internals. + +The F utility bundled with Perl can in principle be used to generate +an initial XS file from a C header file, which (with possibly only minor +edits) can be used to wrap an entire C API. But note that this utility is +rather old and may not handle more modern C header code. + +F (as well as other tools) can also be used to generate an initial +"empty" skeleton distribution even when not deriving from a header file +(see L for more details). + +Typemaps are sets of rules which map C types such as C to logical XS +types such as C, and from there to C and C templates +such as C<$var = ($type)SvIV($arg)> and C +which, after variable expansion, generate C code which converts back and +forth between Perl arguments and C auto variables.
+ +There is a standard system typemap file which contains rules for common C +and Perl types, but you can add your own typemap file in addition, and +from perl 5.16.0 onwards you can also add typemap declarations inline +within the XS file. You can either just add mappings from new C types to +existing XS types to make use of existing templates, or you can add new +templates too. As an example of the former, if you're using a C header +file which has: + + typedef int my_int; + +then adding this typemap entry: + + my_int T_IV + +is sufficient for the XS parser to know to use the existing C +templates when processing an XSUB which has a C parameter type. +See L and L for more information. + +An XS file is parsed by L, or by the F utility +(which is a thin wrapper over the module), and generates a C<.c> file. +F is typically called at build time from the Makefile of a +distribution (as generated by L); or +L can be used directly, e.g. by L. The C +file is then compiled into a C<.so> or C<.dll>, again at module build and +install time. + +=head2 The Structure of an XS File + +An XS file has two parts, which are parsed and treated completely +differently: the C half and the XS half. + +Anything before the first L directive line +is treated as pure C (except for any sections of POD, which are +discarded). All such lines, including C preprocessor directives and C code +comments, are passed through unprocessed into the destination C file. XS +comments (as described below) aren't recognised by the XS parser, and are +just passed through unprocessed. + +It is possible that machine-generated C code inserted in this section +could include an equal sign character in column one, which would be +misinterpreted as POD; if this is a risk, make sure that this hypothetical +code generator includes a leading space character. 
+ +This half of the file is the place to put things which will be of use to +the XSUB code further down: such as C<#include>, C<#define>, C, +and static C functions. Note that you should (in general) avoid declaring +static I in XS files; see L for +details and workarounds. + +After the first C line, the rest of the file is interpreted as XS +syntax. Further C keywords may appear where needed to change the +current package (in a similar fashion to a single Perl F file +having multiple C statements). + +This second half consists mostly of a series of XSUB definitions. Between +these XSUBs, there can be a few file-scope keywords (including further +C lines), POD, C preprocessor directives, XS (C<#>) comments, and +blank lines. See L for more +details. + +The XS half of the file can be thought of as being parsed in two stages. +In the initial processing step, the XS parser does the following basic +text processing actions. + +=over + +=item * + +A trailing backslash (i.e. C) within the XS part of the file is +treated as a line continuation: any such series of lines is concatenated. +They are then treated as a single line by the main (line-orientated) XS +parser. This means that the next line loses any special significance; for +example, it may not be recognised as a keyword or end-of-XSUB blank line. + +Note however that the two characters C<"\\\n"> are kept in the +concatenated line, to be subsequently interpreted by the main part of the +XS parser. Mostly these two extra characters will just confuse the parser; +but where such lines are just passed through to the output C file as-is +(such as for C blocks and C preprocessor directives), this results +in the backslash and newline appearing in the C code. + +The main exception to leaving the continuation characters in the line is +in an XSUB's signature, where any trailing backslashes are stripped away +before parsing. + +=item * + +The parser discards any POD lines. Any sequence of lines matching +C is considered POD.
+ +=item * + +It discards any XS comment lines. Any line starting with C which +is not recognised by the XS parser as a valid C preprocessor directive, +is treated as an XS comment line. + +It is recommended to include at least one space before an XS C<#> comment +to avoid any possible confusion with C preprocessor directives. + +If an XS comment line ends with a backslash, then the line following it is +treated as part of that comment line and is also discarded. + +=back + +Once that basic textual preprocessing has been performed, the main XS +parsing takes place. XS syntax is very line-orientated. XS lines and +sections mostly start with a keyword of the form: + + /^\s*[A-Z_]+:/ + +It is best to position file-scoped keywords at column one, while +XSUB-scoped keywords are best indented. This may avoid surprises with edge +cases in the XS parser. + +Keywords can be either single line, e.g. C, or +multi-line. The latter consume lines until the next keyword, or until the +possible start of a new XSUB (C), or to EOF. Multi-line keywords +treat the rest of the text on the line which follows the keyword as the +first line of data. The exception to this is keywords which introduce a +block of code, such as C or C, which silently ignore the +rest of the first line. (Yes, this is an implementation flaw.) + +It is best to include a blank line between each file-scoped item, and +before the start of each XSUB. While some items are processed correctly +if they are on the line immediately preceding the start of an XSUB, the +parser is inconsistent in their handling. + +An XSUB ends when C is encountered: i.e. a blank line followed +by something on column one. (This is why it's recommended to indent +XSUB-scoped keywords.) If the thing at column one matches any of the items +which can appear in between XSUBs (such as file-scoped keywords) then it, +and any subsequent lines, are processed as such.
Anything starting on +column one which isn't otherwise recognised, is interpreted as the first +line of the next XSUB definition. In particular it is interpreted as the +return type of the XSUB: this can lead to weird errors when something is +unexpectedly interpreted as the start of a new XSUB, such as C, +which isn't valid in the XS half of the file apart from within code +blocks. + +Some multi-line keywords, such as C, are treated as just a single +uninterpreted multi-line string. Others, such as C, have a +specific per-line syntax, where each line within the section is parsed. +Finally, code blocks such as C are just copied as-is to the output C +file (possibly sandwiched between C<#line> directives to ensure that +compiler error messages report from the correct location). + +The XS parser doesn't recognise C comments, so don't use them apart from +in C code (e.g. not in an XSUB signature). More generally, the XS parser +doesn't understand C syntax or semantics; it just uses crude regexes to +parse the XS file. For example in an XSUB declaration like + + int + foo(int a, char *b = "),") + +the parser just extracts out everything between (...) and splits on +commas, with just enough intelligence to ignore commas etc within matching +pairs of double-quotes. The parser doesn't understand C type declaration +syntax; for example it typically just extracts everything before what +appears to be a parameter name, and assumes that it must be the type. That +"type" will later be looked up in a typemap, and if no entry is found, +will only then raise an error. + +In addition, the XS parser has historically been very permissive, even to +the point of accepting nonsense as input. Since Perl 5.44, more things are +likely to warn or raise errors during XS parsing, rather than silently +generating a non-compilable C code file. + +As mentioned earlier, an XSUB definition typically starts with C +and continues until the next C. 
The XSUB definition consists of +a declaration, followed by an optional body. The declaration gives the +function's name, parameters and return type, and is intended to mimic a C +function declaration. It is usually two lines long. + +The XSUB's body consists of a series of keywords. The main C code of an +XSUB is specified by a C or C section. In the absence of +this, a short body is generated automatically, which consists of a call to +a C function with the same name and arguments as the XSUB. In this way, +the XSUB becomes a short wrapper function between Perl and the C library +function, with the wrapper handling the conversion between Perl and C +arguments. This is referred to in this document as autocall. + +Other keywords can be used to modify the code generated for the XSUB, or +to alter how it is registered with the interpreter (e.g. adding +attributes). + +So that is the basic structure of an XSUB. What a real XSUB looks like +will be covered later in L, but first a slight +digression follows. + +=head2 Overview of how data is passed to and from an XSUB + +This section contains a basic background on how XSUBs are invoked, what +their arguments consist of, and how XSUB arguments are passed to and from +Perl. It is essentially a summary of some relevant sections within +L; see that document for a more detailed exploration. + +Note that most of the information in this section isn't needed to create +basic XSUBs; but for more complex needs or for debugging, it helps to +understand what's happening behind the scenes. + +=head3 Perl OPs + +An OP is a data structure within the perl interpreter. It is used to +hold the nodes within a tree structure created when the perl source is +compiled. It usually represents a single operation within the perl source, +such as an add, or a function call. The structure has various flags and +data, and a pointer to a C function (called the PP function) which is used +to implement the actions of that OP.
The main loop of the perl interpreter +consists of calling the PP function associated with the current OP +(C) and then updating it, typically to C<< PL_op->op_next >>. In +particular, the C performs (or at least starts) a function +call. + +=head3 SVs and the Perl interpreter's argument stack + +Almost all runtime data within the Perl interpreter, including all Perl +variables, are stored in an SV structure. These SVs can hold data of +many different types, including integers (IV - integer value), strings (PV +- pointer value), references (RV), arrays (AV), elements of arrays, +subroutines (CV - code value) etc. These will be discussed in more detail +below. + +Perl has an argument stack, which is a C array of SV pointers. Most of +the run-time actions of the PP functions consist of pushing SV pointers +onto the stack or popping them off and processing them. There is a +companion mark stack, which is an array of integers which are argument +stack offsets. These marks serve to delineate the stack into frames. + +Consider this subroutine call: + + @a = foo(1, $x); + +The various OPs executed by the Perl interpreter up until the function is +called will: push a mark indicating the start of a new argument stack +frame; push an SV containing the integer value 1; push the SV currently +bound to the variable C<$x>; and push the C<*foo> typeglob. Then the +PP function C associated with the C will pop +that typeglob, extract the C<&foo> CV from it, and see whether it is a +normal CV or an XSUB CV. + +For a normal Perl subroutine call, C will then: pop the +topmost mark off the mark stack; pop the SV pointers between that mark and +the top of the stack and store them in C<@_>; then set C to the +first OP pointed to by the C<&foo> CV. Those OPs will then be run by the +main loop, until the OPs associated with the last statement of the +function (or an explicit return) leave any return values as SV +pointers on the stack.
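The way marks delineate frames on the argument stack can be sketched with a small stand-alone C model. This is only an illustration under simplifying assumptions: every name in it (toy_interp, toy_pushmark, toy_push, toy_popmark, toy_demo_call) is invented for this sketch and none of them is part of the Perl API. Strings stand in for SV pointers, and details such as exactly which slot a mark records differ in the real interpreter.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names are part of the Perl API. */
#define TOY_STACK_MAX 32

typedef struct {
    const char *args[TOY_STACK_MAX];   /* stands in for the SV* argument stack */
    size_t      sp;                    /* number of items currently on the stack */
    size_t      marks[TOY_STACK_MAX];  /* mark stack: offsets into args[] */
    size_t      mark_sp;               /* number of marks currently pushed */
} toy_interp;

/* Record the current stack height as the start of a new frame. */
static void toy_pushmark(toy_interp *in) {
    assert(in->mark_sp < TOY_STACK_MAX);
    in->marks[in->mark_sp++] = in->sp;
}

/* Push one argument onto the argument stack. */
static void toy_push(toy_interp *in, const char *item) {
    assert(in->sp < TOY_STACK_MAX);
    in->args[in->sp++] = item;
}

/* Pop the topmost mark and report the frame it delimits: *first is set
 * to the frame's first item, and the number of items is returned. */
static size_t toy_popmark(toy_interp *in, const char ***first) {
    size_t base;
    assert(in->mark_sp > 0);
    base = in->marks[--in->mark_sp];
    *first = &in->args[base];
    return in->sp - base;
}

/* Model a call like foo(1, $x) made while one unrelated item is already
 * on the stack; returns the number of arguments that foo() would see. */
size_t toy_demo_call(void) {
    toy_interp in = {0};
    const char **first;
    size_t items;

    toy_push(&in, "outer");   /* left over from an enclosing expression */
    toy_pushmark(&in);        /* start of foo()'s frame */
    toy_push(&in, "1");
    toy_push(&in, "$x");

    items = toy_popmark(&in, &first);
    assert(first[0][0] == '1');   /* the frame starts at "1", not "outer" */
    in.sp -= items;               /* the callee consumes its arguments */
    return items;
}
```

Calling toy_demo_call() reports two arguments: the mark is what lets the callee tell its own arguments apart from whatever was already on the stack, without the caller having to pass a count.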
+ +For an XSUB sub, C will instead note the value of the +topmost mark (but not pop it) and call the C function pointed to from the +CV; this is the XSUB which has been generated by the XS parser. The XSUB +itself is responsible for popping the mark stack, doing any processing of +its arguments on the stack, and then pushing return values. But note that +for straightforward XSUBs, this is usually all done by boilerplate code +generated by the XS parser. Exactly what is done automatically and what +can be overridden and handled manually if needed, is one of the themes of +this document. Finally, C will do any post-processing of +the returned values; for example discarding all but the top-most stack +item if the function call was in scalar context. + +=head3 An SV's reference count + +Perl uses reference counting as its garbage collection method. One of the +always-present fields in an SV is its reference count, accessible as +C. + +Usually an SV's reference count is incremented each time a pointer to the +SV is stored somewhere, and decremented any time such a pointer is +removed. When the reference count reaches zero, any destructor associated +with that SV is called, then the SV is freed. Mismanaging reference counts +can lead to SVs leaking or being prematurely freed. + +When relying on XS to generate all the boilerplate code, reference count +bookkeeping is usually handled for you automatically. Once you start +handling this yourself, then there are some specific considerations. + +Functions which create a new SV, such as C, return an SV +that has an initial C of I. This is actually one too +high, since there are not yet any pointers to this SV stored anywhere. The +expectation is that the SV will shortly be embedded somewhere - such as +stored in an array - which will take "ownership" of that one count. If the +program calls C or similar before the new SV has been embedded, +then it will leak.
Note that C can be trapped by C, so +it's possible that C could be called many times, leaking each +time. Note also that many things may indirectly trigger a C. For +example accessing the value of an SV associated with a tied variable may +trigger a call to its C method, which could call C. So a new +SV needs to be embedded quickly. + +Since such new SVs already have a reference count of one, when embedding +them it should be done in a way which doesn't increase its reference +count. For example, this modifies C to be a reference to a +newly-created SV holding an integer value, i.e. the perl equivalent of +C<$sv =\99>: + + sv_setrv_noinc(sv, newSViv(99)); + +The C<_noinc> variant is used here as it doesn't increment the reference +count of the integer-valued SV when creating a reference to it. + +Where appropriate, reference counts can be adjusted with C +and C and their variants. + +An exception to this system is the argument stack. Pointers on the +argument stack to SVs do I contribute to the reference count of that +SV. The code typically generated by XS takes advantage of this. For +example when ready to return a single value, the XSUB just stores a new SV +pointer at the base of the current stack frame, overwriting the old value, +then resets the argument stack pointer to the base of the frame plus one, +and returns. All the original values on the stack are discarded, without +adjusting any reference counts. + +This can be a problem if the XSUB is returning a new SV. Since this SV +isn't embedded anywhere apart from on the stack (which doesn't hold a +reference count to it), then if the code croaks, the SV on the stack will +leak. To avoid this, there is a separate I stack in the Perl +interpreter. Items on this stack I reference counted. Typically the +temps stack is reset at start of each statement, back to some particular +level. Each SV above this level has its reference count decremented. +Putting an SV on the temps stack is referred to as I it. 
It +is common to create a new SV and mortalise it at the same time: here are +some examples: + + SV *sv_99 = sv_2mortal(newSViv(99)); + SV *sv_abc = newSVpvn_flags("abc", 3, SVs_TEMP); + +Many OPs have an SV attached to them called a I. This SV has a +lifetime which is the same as the sub which the OP is a part of, and +usually has a reference count of one. It is used by many OPs to avoid +having to create (and later free) a temporary SV to return a value. For +example the C op in C<$a + $b> typically extracts the integer values +of its two arguments, calculates its sum, sets its C to that value +and pushes it onto the stack. The C which typically invokes +an XSUB usually has a C attached to it, and when returning a +value, the XSUB's boilerplate code generated by XS will usually try to use +it to return the value, rather than creating a fresh mortal SV on each +call. + +Note that there is a I perl interpreter build option, +C, under which the argument stack I reference counted, +but that is currently beyond the scope of this document. + +=head3 The IV, NV etc types + +An C (Integer Value) is a typedef in the perl interpreter's header +files that maps to a C integer. The exact integer type and size will +depend on the build configuration of the interpreter. It is guaranteed to +be large enough to hold a pointer. A C is the same but unsigned. An +C (numeric value) is a floating-point value; usually a C. +These types are used widely within the perl interpreter. + +A PV (pointer value) "type" is often used informally within documentation +and within the names of structure fields etc to refer to a string pointer +(C), but it is not actually a declared type. Similarly, RV +(reference value) is informally a pointer to another SV. + +There are also C and C, which are large enough to hold +signed and unsigned integer values representing the number of items in a C +array. 
C is used specifically for variables which store the number +of characters in a string (it is typically just an alias for C). + +=head3 The SV scalar value structure + +As mentioned above, almost all runtime data within the perl interpreter is +stored in an SV (scalar value) structure. The head of an SV structure +consists of three or four words: a reference count; a type and flags; a +pointer to a body; and since perl 5.10.0, a general payload field. There +are around 17 types, and the type indicates what body (if any) is pointed +to from the SV's head. The body type only indicates what I of data +the body is capable of holding; the actual "type" of the SV (IV, NV, PV, +RV etc) is mostly indicated by what flags are set. + +Simple SVs may not have a body. Undefined values typically don't have one. +Also, some IV, NV, and RV values are stored directly in the payload field, +with the body pointer pointing back to the head with a suitable offset so +that the SV appears to have a body with an IV or whatever at a suitable +offset. + +For SVs which have a body, the payload field in the head is usually used +to store one common value which would otherwise have to be stored in the +body and require a further pointer indirection to access. For example, the +C pointer of a perl string SV is stored in the head, while the +length is stored in the body. + +The fields of an SV (both in the head and in the body) are usually +accessed via macros, which has allowed various rearrangements of the head +and body fields over the years while maintaining backwards compatibility. +Always use the macros. For example, C directly accesses the IV +field of the SV (which may be in the head or body depending on the SV's +type). If the SV has a valid integer value, then the C flag will +be set, which can be tested with the macro C. + +The body of an SV may be upgraded to a "bigger" one during the SV's +lifetime, but it is not usually downgraded.
For example, during the course +of executing this perl code: + + my $x; + $x = "1"; + $u = $x + 1; + undef $x; + +Initially the SV has no body and none of the C, C, +C, nor C flags are set, indicating that it has neither +an IV, NV, PV or RV value. The complete lack of those flags indicates an +undefined value. After the string is assigned to it, its body type is set +to C, and it is given the corresponding body. The string pointer +and length are stored in the body (or perhaps the pointer in the payload +word), and the C flag is set, indicating that the SV holds a +valid string value. + +When Perl wants to use that SV as an integer, it uses a macro like +C to return the integer value. Unlike the direct C +macro, this first checks C, and if not true, calls a function +which calculates the integer value from its current string value. The +effect of this call is to update the SV's type and body to C +which is capable of holding both a string I and integer value, and +then to set the C flag in addition to the C. + +Finally, the C frees the string and turns off the C and +C flags, but leaves the body type as C. (Hence why an +SV's current Perl-level type should be determined by its flags, not its +body type.) + +Note that you should never directly access fields using macros like +C (the C implies direct) I you have just tested for +C. In general, always use macros such as C, which will do +any checking and conversion for you. + +There is a further complication with SVs: they can have one or more items +of I attached to them. These are small payloads, along with a +pointer to a jump table of pointers to functions with get/set etc actions. +They are used to implement things like C<$1>, C<$.> and tied variables. +The idea is that macros like C will first check whether the SV has +I magic (using C); and if so call its get method +first. 
For example, for a tied variable, this C-level get function will +call the perl-level C method and assign the return value of that +to the SV. Only then will C do its C check. + +When presented with an unknown SV, it should always have its magic checked +before examining the values of the SVs flags. + +In total, the C macro does roughly the equivalent of: + + if (SvGMAGICAL(sv)) + mg_get(sv); /* do FETCH() etc; update the SV's value / flags */ + if (!SvIOK(sv)) + sv_2iv(sv); /* convert undef to 0, "1" to 1 etc */ + return SvIVX(sv); /* use the raw value */ + +You will see soon that XS's typemap templates mostly use high-level macros +like C, so this is usually all handled automatically for you. Only +if you start to do your own type conversions will you need to worry about +these details. + +Forgetting to test for, and to call, C magic will typically appear to +work fine until the first time someone passes a tied variable or similar +to your XSUB, and C doesn't get called. Accessing fields with +C etc without testing for C first may access a field in +a body which doesn't exist and possibly trigger a SEGV. + +Magic should only be called once per "use"; for example if a tied scalar +is passed as an argument to your XSUB, you would expect C to only +be called once. Normally this is easy because you (or the typemap code) +does a single C call. Occasionally you may have explicitly called +C first, perhaps in order to check some flags; if so, you can +skip a second magic call with variants like C. For example: + + SvGETMAGIC(sv); /* this calls mg_get() if SvGMAGICAL() */ + if (SvNOK(sv)) + /* special-case: do something with a floating-point value */ + else { + IV i = SvIV_nomg(sv); + /* fall-back to treating it as an integer value */ + } + + +A Perl reference is just another type of scalar. It is indicated by +C being true, and the pointer to the referent SV is accessed +using C. 
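The flag-driven conversion described earlier in this section (check the integer flag, convert from the string only if needed, then cache the result) can be sketched in standalone C. This is a toy model under stated assumptions: the names C<toy_scalar>, C<TOY_IOK>, C<TOY_POK> and C<toy_get_iv> are hypothetical, magic is not modelled at all, and the real interpreter's conversion rules are considerably more involved.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model only: hypothetical names, NOT the perl API. */
#define TOY_IOK 0x1u   /* the integer slot holds a valid value */
#define TOY_POK 0x2u   /* the string slot holds a valid value */

typedef struct {
    unsigned    flags;
    long        iv;           /* integer slot */
    const char *pv;           /* string slot */
    int         conversions;  /* how many times the string was parsed */
} toy_scalar;

/* High-level accessor: return an integer value, converting from the
 * string slot on first use and caching the result by setting TOY_IOK. */
static long toy_get_iv(toy_scalar *sv)
{
    assert(sv != NULL);
    if (sv->flags & TOY_IOK)
        return sv->iv;                       /* already valid: raw access */
    if (!(sv->flags & TOY_POK))
        return 0;                            /* undefined value: treat as 0 */
    sv->iv = strtol(sv->pv, NULL, 10);       /* "1" becomes 1 */
    sv->flags |= TOY_IOK;                    /* both slots are now valid */
    sv->conversions++;
    return sv->iv;
}
```

Calling C<toy_get_iv()> twice on a scalar holding the string C<"1"> parses the string only once; the second call takes the fast path because the integer flag is already set, which mirrors why reading a raw field is only safe after the corresponding flag has been checked.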
+ +The equivalent of C for strings is + + STRLEN len; + char *pv = SvPV(sv, len); + +(and variants) which both retrieves a string pointer and sets C to +its length. Note that there is no guarantee that after this call +C is true, nor that C. For example, C may +be a reference to a blessed object with an overloaded stringify (C<"">) +method. In which case, behind the scenes there may be a temporary SV +containing the result of the call to the method, with C pointing to +I SV's string buffer. C remains a reference. Similarly, a +non-overloaded reference to an array may return a temporary string like +C<"ARRAY(0x12345678)">. + +If you need to coerce an SV to a string (e.g. before modifying its string +value) then use C or one of its variants. For example if +used on an array reference, the SV will be converted from a reference into +a plain string SV with an C value of C<"ARRAY(0x12345678)">, and +the array's reference count decremented. + +Once an SV has been coerced into a PV (C is true), then +C represents the size of the allocated buffer, while +C represents the current length (in bytes) of the string. Note +that with Unicode, C may not necessarily equal the value +returned by the Perl built-in C, which is the length in +I. That can be obtained using the C function. +See L below for more details. + +The SV structure can also be used to store things which aren't simple +scalar values: in particular, arrays, hashes and code values. There are +typedefs for AV, HV and CV structures (plus a few others). These +structures are identical to SVs and can generally be used interchangeably +with suitable casting, e.g. C. The main feature of +these non-scalar SVs is that the value of the type field in these cases, +C, C, C etc, actually I indicate the +Perl type, rather than just indicating what sort of body they have. + +An important thing to note is that AVs and HVs are never directly pushed +onto the stack when calling and returning from subroutines and XSUBs. 
+Instead where necessary, references to them are pushed. You will likely +first spot such an error when you start getting "Bizarre copy of ..." +error messages. + +=head3 Unicode and UTF-8 + +A simple Perl string SV uses what is sometimes referred to as byte +encoding: each character is represented using a single byte. But when a +Perl string contains code points >= 0x100, each character of the string is +stored as a variable number of bytes using the UTF-8 encoding scheme, with +the C flag being set to indicate this. Other strings may or +may not be using UTF-8 encoding, depending on the history of the string. +For example, with: + + my $s = "A\x80"; + $s .= "\x{100}"; + chop $s; + +the string starts off in byte encoding, with C and with each byte representing one character. When +the extra character is appended, the string gets upgraded to UTF-8, with +C and the second and third +characters each using two bytes of storage. Once the third character is +removed, the string stays in UTF-8 encoding, with C and the second character using two bytes. So such a +string SV when passed to an XSUB has two possible representations; and +which will be used is somewhat unpredictable. + +Unfortunately XS currently has no support for UTF-8. All the standard +typemap entries, such as C, assume that the buffer of a string SV +is just an array of bytes to be manipulated by the XSUB or passed on +uninterpreted to a C function. If it is necessary for the XSUB to control +the UTF-8 status of an argument, then it is best to declare the parameter +as type C and do your own manipulation of it. Similarly for returning +string values. + +An SV's string representation can be forced to bytes using C +and variants; if the string contains any characters not representable in +a single byte, then that call croaks with a C error. +Conversely, C and variants will force the string to UTF-8. + +See L for more details. 
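To make the difference between byte length and character length concrete, here is a small standalone C sketch which counts the characters in a UTF-8 buffer. It does not use the perl API; the helper name C<utf8_char_count> and the two sample buffers are assumptions made purely for illustration, and the input is assumed to be well-formed UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/* Count the code points in a well-formed UTF-8 buffer of len bytes, by
 * counting every byte which is not a continuation byte (10xxxxxx). */
static size_t utf8_char_count(const unsigned char *buf, size_t len)
{
    size_t chars = 0;
    size_t i;
    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++) {
        if ((buf[i] & 0xC0) != 0x80)
            chars++;
    }
    return chars;
}

/* "A\x80" in byte encoding: two bytes, two characters. */
static const unsigned char as_bytes[] = { 0x41, 0x80 };

/* "A\x80\x{100}" once upgraded to UTF-8: five bytes, three characters.
 * U+0080 is stored as C2 80 and U+0100 as C4 80. */
static const unsigned char as_utf8[] = { 0x41, 0xC2, 0x80, 0xC4, 0x80 };
```

Note that running C<utf8_char_count()> over the byte-encoded buffer gives the wrong answer (the lone 0x80 looks like a continuation byte), which is why an XSUB has to know which of the two representations it has been handed before interpreting the buffer.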
+ +=head2 The Anatomy of an XSUB + +The previous section has explained how arguments are pushed onto the +stack, what those arguments look like, and how XSUBs are called. We will +now look at what happens I an XSUB function once called; in +particular, how it retrieves values from its arguments on the stack and +later returns a value or values on the stack; and how XS and typemaps +automate most of this. + +This section will provide both an overview of what an XSUB looks like in +XS, I what sort of C code is generated for it. The majority of the +rest of this document will then describe in more detail the various parts +of an XSUB mentioned here. Note that the various keywords within an XSUB's +definition usually correspond closely (and in the same order) to what C +code is generated for the XSUB. Most of the boilerplate code generated for +an XSUB is concerned with getting argument values off the stack at the +start, then returning zero or one result values on the stack at the end. + +A typical XSUB definition might look like: + + MODULE = Foo::Bar PACKAGE = Foo::Bar + + short + baz(int a, char *b = "") + PREINIT: + long z = ...; + CODE: + ... do stuff ...; + RETVAL = some_function(a, b, z); + OUTPUT: + RETVAL + +The first two lines of an XSUB are its declaration, which must be preceded +by a blank line. It gives the XSUB's return type, its name, and its +parameters (including any default values). While it is modelled on C +syntax, it is actually XS syntax (so for example C isn't +recognised). The return type and name must both start on column one, +although the XS parser actually allows both to be on the same line, such +as + + short baz(...)
+ +This XSUB definition will be translated into a C function whose start may +look something like this (the exact details may vary across XS parser +releases): + + void + XS_Foo__Bar_baz(pTHX_ CV* cv) + { + dVAR; dXSARGS; + if (items < 1 || items > 2) + croak_xs_usage(cv, "a, b= \"\""); + +Note that the first line of the function is actually specified using a +macro such as C, but for explanatory purposes, what is +shown above is one possible expansion of that macro, depending on the Perl +version and XS configuration. + +The important thing to note is that the XSUB's arguments are I passed +as arguments of the C function; they are still on the Perl argument stack. +Nor is the XSUB's return value returned by the C function. + +The C function's name is based on the XSUB's name plus the current XS +package (with C). Apart from debugging, you don't generally need +to know this name. + +The function's parameters are the CV associated with this XSUB (i.e. +C<&Foo::Bar::baz>) and, on MULTIPLICITY/threaded builds, a pointer to the +current Perl interpreter context. You won't need to directly use these +most of the time. + +The first few lines of code in the C function are standard boilerplate +added to all XSUBs. Note that the naming convention for Perl +interpreter macros is that ones starting with a C are declarations; they +go in places where a variable can be declared, and typically declare one +or more variables and possibly their initialisations. + +C pops one index off the mark stack and sets up some auto +variables to allow the arguments on the stack to be accessed: +specifically, the variable C is declared, which indicates how many +arguments were passed, and some hidden variables are also declared which +are used by the macro C to retrieve a pointer to argument C from +the stack (counting from 0). The stack pointer is not actually decremented +yet. + +For a generic list-processing XSUB, these argument-accessing variables and +macros may be used directly.
But more commonly, for an XSUB which has a +fixed signature (as in the example above), the parser will declare an auto +C variable for each parameter, and (using the system or a user typemap) +assign them values extracted from C etc. It will also declare a +variable called C with the XSUB's return type (unless that is +C), which is typically assigned to by the coder and then whose value +is automatically returned. Continuing the example above, the generated +code for the input part of the XSUB is similar to: + + { + long z = ...; + short RETVAL; + int a = (int)SvIV(ST(0)); + char *b; + + if (items < 2) + b = ""; + else + b = (char *)SvPV_nolen(ST(1)); + +This consists of declarations for C, C, C and C, plus +code to initialise them. The part of the code which extracts a value from +an SV on the stack, such as C<(int)SvIV(ST(0))>, is derived from a typemap +entry. For a simple entry such as the one for C, the code may be added as +part of the declaration of the variable itself; otherwise the +initialisation may be done as a separate statement after all the variable +declarations (such as for C). + +Variable declarations appear in the order they appear in C and +C blocks, followed by C and then any parameters defined +completely within the signature (i.e. which don't use an C section +to specify their type). + +Perls before 5.36 used C89 compiler semantics, which didn't allow variable +declarations after statements. To work round this, the C keyword +allows you to inject additional variable declaration code. + +Following on from the input part, the main body of the function is output; +this is copied exactly as-is from the C or C section, if +present.
If neither is present, the parser will assume that this XSUB is +just wrapping a C library function of the same name as the XSUB, and will +automatically generate some code like the following: + + RETVAL = baz(a, b); + +The C and C keywords may be used to add code just before +and after the main code; typically only useful for autocall. + +C is the same as C except that after argument processing, +the stack pointer is reset to the base of the frame, and the coder becomes +responsible for pushing any return values onto the stack. No further +keywords can follow C. This is typically used for XSUBs which +need to return a list or have other complex requirements beyond just +returning a single value. + +For C and autocall, unless the return type is void, the parser will +generate code to return the value of C. This is automatic in the +case of autocall, but for C you have to ask the parser to do so +with C. The code generated in either case may look +something like + + { + SV *RETVALSV = sv_newmortal(); + sv_setiv(RETVALSV, (IV)RETVAL); + ST(0) = RETVALSV; + } + +A temporary SV will be created, set to the value of C (again, +using a typemap template), then placed on the stack. In practice, various +optimisations may be used; in particular, the C target SV which is +attached to the calling C may be used instead of allocating +and freeing an SV for each call, as explained earlier. + +XSUB parameters declared as C or C will cause additional +output code to be generated which respectively: updates the value of one +of the passed arguments; or pushes the value of that parameter onto the +stack (in addition to C). + +Finally, (apart from C), a macro like this is added to the end of +the C function: + + XSRETURN(1); + +This resets the stack pointer to one above the base of the frame (so the +top item on the stack is C), then does C. + +For a C XSUB, C is used instead. 
+ +=head2 Returning Values from an XSUB + +An XSUB's declared return type is typically a C type such as C or +C. XS is very good at automating this common case of returning a +single C-ish value: behind the scenes it creates a temporary SV; then, +using an appropriate typemap template, sets that SV to the value of +C and returns that SV on the stack. + +But sometimes you want to return a Perl-ish value rather than a C-ish +value, for example, Perl's undef value or a Perl array reference. Or you +may want to return multiple values, or update one of the passed +arguments. The following subsections describe various such cases. + +Note that XSUBs are somewhat like Perl lvalue subs, in that they return +the actual SV to the caller, while normal Perl subs return a temporary +copy of each return value. When returning a C value like C this +doesn't matter, since the XSUB is returning a temporary SV anyway; but +when returning your own SV, it could in theory make a visible difference. +For example, + + sub foo { $_[0]++ } + foo(an_xsub_which_returns_element_0_of_an_array(\@a)); + +would increment C<$a[0]>. + +=head3 Returning undef / TRUE / FALSE / empty list + +Sometimes you need to return an undefined value, e.g. to indicate failure. +It's possible to return early from a CODE block with an undefined value, +bypassing the normal creation of a temporary SV and the setting of its +value. For example: + + int + file_size(char *filename) + CODE: + RETVAL = file_size(filename); + if (RETVAL == -1) + XSRETURN_UNDEF; + OUTPUT: + RETVAL + +The C macro causes the address of the special Perl SV +C to be stored at C (this is the same value that the +Perl function C returns), and then to immediately return. 
+ +If using autocall, then you can instead return early in a C +section: + + int + file_size(char *filename) + POSTCALL: + if (RETVAL == -1) + XSRETURN_UNDEF; + +There are similar macros + + XSRETURN_YES + XSRETURN_NO + XSRETURN_EMPTY + +which allow you to return Perl's true and false values, or to return +an empty list. + +If your XSUB will always explicitly return a special SV and won't ever +require typemap conversions (e.g. it always returns via C or +C), then just declare the return type as C. + +Note that any early return from an XSUB should always be via one of the +C macros and not directly via C; the former will do any +bookkeeping associated with the argument stack. + +=head3 Returning an SV* + +More generally, you may want to create and return an SV yourself, rather +than relying on the boilerplate XSUB code to generate a temporary SV and +set it to a C-ish value. Here you would declare the return type as C. +For example: + + SV* + abc(bool uc) + CODE: + RETVAL = newSVpv(uc ? "ABC" : "abc", 3); + OUTPUT: + RETVAL + +There is some special processing which happens when using a return type +such as C. First, consider that for a C return type like C, the +typemap template which sets the temporary SV's value may look something +like: + + sv_setiv($arg, (IV)$var); + +which after expansion may look like: + + sv_setiv(RETVALSV, (IV)RETVAL); + +where the temporary SV has previously been assigned to C. + +Now, if you declare an XSUB with a return type of C, you I +expect the typemap template to look something like: + + sv_setsv($arg, (SV*)$var); + +This Perl library function copies the value of one SV to another (the +XS user's equivalent of the Perl C<$a = $b>). + +However, the design decision was made that for the C type in +particular, the typemap template would be + + $arg = $var; + +Here is where the special processing comes in. 
The XS compiler, in the +case of an output template beginning C<$arg = ...>, skips creating a +temporary SV, and just returns the SV in C directly. So the +typemap template would be expanded to + + ST(0) = RETVAL; + +This is faster than copying. + +But in addition, for I C<$arg = ...> template (not just the template +for C), the XS compiler makes one further assumption: that the +expression to the right of the assign evaluates to an SV with a reference +count I, and so in addition, the XS compiler emits: + + sv_2mortal(RETVAL); + +or similar, which causes the reference count of the SV to be decremented +by one at (typically) the start of the next statement. This makes sense if +the SV is newly created with one of the C family of functions: +see the discussion on this in L. + +However, if the SV comes from elsewhere, for example via a Perl array +lookup, then its reference count doesn't need to be adjusted, and so the +mortalising will cause it to be prematurely freed. In this case, you need +to artificially increase the SV's reference count. + +The previous example showed creating a new SV using C; here's an +example where the SV pre-exists in an array: + + SV* + lookup(int i) + CODE: + { + SV** svp = av_fetch(some_array_AV, i, 0); + if (!svp) + XSRETURN_UNDEF; + /* compensate for the implicit mortalisation */ + RETVAL = SvREFCNT_inc(*svp); + } + OUTPUT: + RETVAL + +Finally, note that some very old (pre-1996) XS documentation suggested +that you could return your own SV using code like: + + void + foo(...) + CODE: + ST(0) = some_SV; + +This is very wrong, as the C declaration tells the XS code to expect +to return I items on the stack. There is still some code like this +in the wild, and to work around it, the XS compiler does a very special +and ugly hack for a C XSUB when it sees C being assigned to +within a C block: it pretends that the XSUB was actually declared as +returning C and so emits C rather than +C.
But don't rely on this: it is likely to warn +eventually. If your XSUB is doing its own setting of C, then always +declare the return type as C. + +The mark stack isn't used when returning arguments; instead, the caller of +the XSUB (usually the C) notes the offset of the base of +the argument stack frame before calling the XSUB and the offset of the +stack pointer on return, and can deduce the number of returned arguments +from that. + +=head3 Returning AV* etc refs + +Sometimes you want to return a non-scalar SV, such as an AV, HV or CV. +However, these aren't allowed directly on the argument stack. You are +supposed instead to return a I to the AV: a bit like a Perl sub +returning C<\@foo>. + +The standard typemaps can create this reference for you automatically. So +for example an XSUB with a return type of C will actually create and +return an RV scalar which references the AV in C. So the XS +equivalent of Perl's C might be: + + AV * + array89() + CODE: + RETVAL = newAV(); + /* see text below for why this line is needed */ + sv_2mortal((SV*)RETVAL); + av_store(RETVAL, 0, newSViv(8)); + av_store(RETVAL, 1, newSViv(9)); + OUTPUT: + RETVAL + +Note that the C variable is declared as type C, but what is +actually returned to the caller is a temporary SV which is a reference to +C. The standard output typemap template for the C type looks +like: + + $arg = newRV((SV*)$var); + +This means it creates a new RV which refers to the AV. Because of the +rule for C<$arg = ...> typemaps, the RV will be correctly mortalised +before being returned. However, the C function increments the +reference count of the thing being referred to (the C AV in this +case). Since the AV has just been created by C with a reference +count one too high, it will leak. This is why the C is +required. Conversely for a pre-existing AV, the mortalisation isn't +required.
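The reference count arithmetic behind that leak can be checked with a small standalone model. It is a sketch under stated assumptions rather than perl code: C<toy_sv>, C<toy_new_ref>, C<toy_mortal> and the other names are hypothetical, a C<freed> flag stands in for actually releasing memory, and the temps stack is reduced to a small fixed array.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical names, NOT the perl API. */
typedef struct toy_sv {
    int            refcnt;
    int            freed;      /* set instead of really freeing anything */
    struct toy_sv *referent;   /* non-NULL when this value is a reference */
} toy_sv;

static toy_sv *toy_temps[16];
static int     toy_temps_n = 0;

/* A newly created value starts with a count of one. */
static void toy_new(toy_sv *sv)
{
    sv->refcnt = 1;
    sv->freed = 0;
    sv->referent = NULL;
}

static void toy_dec(toy_sv *sv)
{
    assert(sv->refcnt > 0);
    if (--sv->refcnt == 0) {
        sv->freed = 1;
        if (sv->referent)
            toy_dec(sv->referent);   /* a freed reference releases its target */
    }
}

/* Create a reference which increments the target's count. */
static void toy_new_ref(toy_sv *rv, toy_sv *target)
{
    toy_new(rv);
    rv->referent = target;
    target->refcnt++;
}

/* Mortalise: schedule one decrement for the next toy_free_temps(). */
static void toy_mortal(toy_sv *sv)
{
    assert(toy_temps_n < 16);
    toy_temps[toy_temps_n++] = sv;
}

static void toy_free_temps(void)
{
    while (toy_temps_n > 0)
        toy_dec(toy_temps[--toy_temps_n]);
}

/* Returns the array's reference count once the "statement" has ended:
 * 0 means it was freed, anything higher means it leaked. */
static int toy_return_array_ref(int mortalise_array)
{
    toy_sv av, rv;
    toy_new(&av);               /* freshly created array: count is 1 */
    if (mortalise_array)
        toy_mortal(&av);        /* the extra mortalising step */
    toy_new_ref(&rv, &av);      /* incrementing reference: count is 2 */
    toy_mortal(&rv);            /* what the "$arg = ..." rule arranges */
    toy_free_temps();           /* start of the next statement */
    return av.refcnt;
}
```

Without the extra mortalising step the array ends with a count of one and nothing left pointing at it, i.e. a leak; with it, the count drops to zero as soon as the returned reference is discarded.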
+ +Since perl 5.16, there are a set of alternative XS types which can be used +for AVs etc which I increment the reference count of the AV when +being pointed to from the new RV. These can be enabled by mapping the +C etc C types to these new XS types: + + TYPEMAP: < and +handle the RV generation yourself: + + SV * + create_array_ref() + CODE: + RETVAL = newRV_noinc((SV*)newAV()); + OUTPUT: + RETVAL + +If instead you want to return a flattened array (the equivalent of Perl's +C) then you would have to push the elements of the array +individually onto the stack in a C block. See L below. + +Finally, the C C type in the standard typemap is a way of creating +and returning a reference to a scalar (as opposed to the C type, +which just returns a scalar). In this case you have to tell the C compiler +that C is just another name for C: + + typedef SV *SVREF; + +Then in an XSUB like + + SVREF + foo() + CODE: + RETVAL = newSViv(9); + sv_2mortal(RETVAL); + OUTPUT: + RETVAL + +C will be declared with type C, and the XSUB will return a +reference to an integer: the perl equivalent of C. + +=head3 Updating arguments and returning multiple values. + +By using the C and similar parameter modifiers, XS provides +limited support for returning extra values in addition to (or instead of) +C, either by updating the values of passed arguments (C), or +by returning some of the parameters (and pseudo-parameters) as extra +return values (C). For returning an arbitrary list of values, see +the next section. 
+ +Here are a couple of simple XS examples with their approximate perl +equivalents: + + # Update a passed argument + + void sub inc9 { + inc9(IN_OUT int i) my $i = $_[0]; + CODE: $i += 9; + i += 9; $_[0] = $i; + } + + # Return (2*$i, 3*$i) + + void sub mul23 { + mul23(int i, \ my $i = $_[0]; + OUTLIST int x, \ my ($x, $y); + OUTLIST int y) $x = $i * 2; + CODE: $y = $i * 3; + x = i * 2; return $x, $y; + y = i * 3; } + +See L +for the full details. + +=head3 Returning a list + +If you want to return a list, i.e. an arbitrary number of items on the +stack, you generally have to forgo the convenience of some of the +boilerplate code generated by XS, which is biased towards returning a +single value. Instead you will have to create and push the SVs yourself. +The L keyword is specifically intended for +this purpose. Here is a simple example which does the same as the +Perl-level C: + + void + one_to_n(int n) + PPCODE: + { + int i; + if (n < 1) + Perl_croak_nocontext( + "one_to_n(): argument %d must be >= 1", n); + EXTEND(SP, n); + for (i = 1; i <= n; i++) + mPUSHi(i); + } + +The C keyword causes the argument stack pointer to be initially +reset to the base of the frame (discarding any passed arguments), and +suppresses any automatic return code generation. The return type of the +XSUB is ignored, except that declaring it C suppresses the +declaration of a C variable. + +The C macro makes sure that there are at least that many free +slots on the stack (its first argument should always be C). The +C macro creates a new SV, mortalises it, sets its value to the +integer C, and pushes it on the stack. + +Here's another example, which flattens the array passed as an argument: +the equivalent of this Perl: + + sub flatten { my $aref = $_[0]; @$aref; } + +In this example, the SVs being pushed aren't freshly created with a +reference count one too high, so don't need mortalising.
+ + void + flatten(AV *av) + PPCODE: + { + int i; + int max_ix = AvFILL(av); + SV **svp; + EXTEND(SP, max_ix + 1); + for (i = 0; i <= max_ix; i++) { + svp = av_fetch(av, i, 0); + PUSHs(svp ? *svp : &PL_sv_undef); + } + } + +This function actually expects to be passed a I to an array: +the input typemap entry for C automatically takes care of +dereferencing the argument and croaking if it's not actually a reference. +The C macro simply pushes an SV onto the stack, without any +mortalising or copying. Any "holes" in the array are filled with undefs. + +=head2 Bootstrapping + +In addition to the C C function generated for each XSUB +declaration, a C C function is also automatically +generated, one for each XS file. This XSUB function is called once when +the module is first loaded. For each declared XSUB in the file, a line +similar to the following is added to the boot function: + + newXS("Foo::Bar::baz", XS_Foo__Bar_baz); + +(the exact details of the code will vary across releases and +configurations). This call creates a CV, flags it as being an XSUB, adds +a pointer from it to C, then adds the CV to the +C<*Foo::Bar::baz> typeglob in the Perl interpreter's symbol table. It is +the XS equivalent of the Perl-level + + *Foo::Bar::baz = sub { ... } + +For some XSUBs, additional lines may be added by the parser to the boot +XSUB to handle things like aliases or overloading. + +You can add your own additional lines to the boot XSUB using the C +keyword. + +A typical Perl module like F should have code in it similar +to: + + package Foo::Bar; + our $VERSION = '1.01'; + require XSLoader; + XSLoader::load(__PACKAGE__, $VERSION); + +This causes the F or F file to be dynamically linked in +and then the C function called. This boilerplate code is +typically created automatically with F when you first create the +skeleton of a new distribution. See L for more details. + +=head1 REFERENCE MANUAL + +This part of the document explains what each XS keyword does.
They are +arranged in the approximate order in which they might appear within an XS +file, and then might appear within an XSUB declaration. Related keywords +are grouped together. + +=head2 The MODULE Declaration + + MODULE = Foo::Bar PACKAGE = Foo::Bar + MODULE = Foo::Bar PACKAGE = Foo::Bar::Baz + MODULE = Foo::Bar PACKAGE = Foo::Bar PREFIX = foobar_ + +The C keyword is used to start the XS half of the file, and to +specify the package of the functions which are being defined. The +C keyword must start on column one. All text preceding the first +C keyword is considered C code and is passed through to the output +with POD stripped, but otherwise untouched. + +It is usually necessary to include a blank line before each MODULE +declaration. + +For the first such declaration, the C and C values are +typically the same. In subsequent entries, the C value varies, +while the C value is kept unchanged. In fact, only the C +value from the I such declaration is used, and specifies the name of +the boot XSUB which is called when the module is loaded (typically via +C). + +The value of the C keyword is analogous to the Perl C +keyword, and determines which package any subsequent XSUBs will be created +in. It is permissible to have the same C value appear more than +once, again similarly to Perl. + +In theory the C keyword is optional, and defaults to C<''>. This +means that any subsequent XSUBs will be placed in the C package. +In practice, you should always specify the package. + +The optional C value is stripped from the XSUB's name when +generating the XSUB's Perl name. It is typically used to simplify creating +autocall XSUBs. It addresses the issue that while Perl has package names, +C only has function name prefixes. Consider a C library called C, +which has functions such as C and C. We +want to make these accessible from a Perl module called C. In +the presence of C, any such prefix of each XSUB +name will be stripped off when determining the XSUB's Perl name. 
For +example: + + MODULE = Foo::Bar PACKAGE = Foo::Bar PREFIX = foobar_ + + char* foobar_read(int n) + + int foobar_write(char *text, int n) + +This will insert two XSUBs into the Perl namespace, called +C and C, which when called, will +themselves call the C functions C and C. + +=head2 File-scoped XS Keywords and Directives + +After the first C keyword, everything else in the file consists of +XSUB definitions, plus anything that comes between the XSUBs. The XSUBs +will be explained further down, but this section addresses the in-between +stuff, which can consist of any of the following. + +=over + +=item * + +A few file-scoped keywords (including further MODULE declarations), whose +effects usually last for the rest of the file. These keywords will be +detailed further down in this section. + +=item * + +POD, which is stripped out. It must be terminated with C<=cut>. + +=item * + +Blank lines, which are discarded. + +=item * + +Known C C preprocessor directives, which are passed through as-is. +Conditional ones, such as C<#if> and C<#else>, have some basic analysis +performed on them which, in particular, allows two variants of the same +XSUB to be declared without raising a "duplicate XSUB" warning. This +warning suppression only works for the if/else/endif form. For example +this works: + + #ifdef USE_2ARG + + int foo(int a, int b) + + #else + + int foo(int a) + + #endif + +while this form will still raise warnings: + + #ifdef USE_2ARG + ... + #endif + + #ifndef USE_2ARG + ... + #endif + +=item * + +XS comment lines, which are stripped out; either a C which isn't +recognised as a C preprocessor directive, or C/. + +=item * + +Anything else is an error, unless it starts on column one, in which case +it will be treated as the start of a new XSUB. + +=back + +The following file-scoped keywords are supported. Note that the +L can technically be a +file-scoped keyword too, but is described further down as an XSUB keyword. 
+ +=head3 The REQUIRE: Keyword + + REQUIRE: 3.58 + +The C keyword is used to indicate the minimum version of the +C XS compiler (and its F wrapper) needed to +compile the XS module. It is expected to be a floating-point number of the +form C<\d+\.\d+/>. It is analogous to the perl C. + +=head3 The VERSIONCHECK: Keyword + + VERSIONCHECK: DISABLE | ENABLE + +Version checking (enabled by default) checks that the version compiled +into the C<.so> or C<.dll> file matches the C<.pm> file's C<$VERSION> +value, and if not, dies with an error message like: + + Foo::Bar object version 1.03 does not match bootstrap parameter 1.04 + +Typically, when a module is built for the first time, the value of the +C<$VERSION> variable in the C<.pm> file is copied to the generated +C as C, and from there, via a C<-DXS_VERSION=...> +compiler option, is baked into the boot XSUB. When the module is loaded +and the boot code called, the versions are compared, and it croaks if +there's a mismatch. This usually indicates that the C<.so> and C<.pm> +files are from different installs: for example someone copied over a more +recent version of the C<.pm> file but forgot to copy or rebuild the +C<.so>. + +If the version of the PM module is a floating point number, it will be +stringified before the comparison, with a possible loss of precision +(currently chopping to nine decimal places), so it may not match the +version of the XS module any more. Quoting the C<$VERSION> declaration to +make it a string is recommended if long version numbers are used. + +There is rarely any good reason to disable this check. + +Note that this module version checking is completely unrelated to the +C keyword, which is a check against the version of the I. + +The C keyword corresponds to F's C<-versioncheck> +and C<-noversioncheck> options. This keyword overrides the command line +options. 
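The way a version is baked in at compile time and compared at load time can be pictured with a small stand-alone C sketch. It is a simplification and not the code that the XS parser emits: the function name version_matches is hypothetical, the fallback value "1.04" is an arbitrary stand-in for what a -DXS_VERSION=... option would supply, and an exact string comparison is assumed here, whereas the real boot code may compare versions differently.

```c
#include <assert.h>
#include <string.h>

/* Normally supplied on the compiler command line via -DXS_VERSION=...;
 * the fallback below is an arbitrary value used only for illustration. */
#ifndef XS_VERSION
#  define XS_VERSION "1.04"
#endif

/* Hypothetical helper: returns 1 if the version requested by the .pm
 * file equals the version compiled into the object file, else 0. */
static int
version_matches(const char *pm_version)
{
    assert(pm_version != NULL);
    return strcmp(pm_version, XS_VERSION) == 0;
}
```

Under these assumptions, a C<.pm> file that was updated without rebuilding the object file would pass a string that no longer equals the compiled-in value, and the load would be rejected.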
+ +=head3 The PROTOTYPES: Keyword + + PROTOTYPES: DISABLE | ENABLE + +When prototypes are enabled (they are disabled by default), any +subsequent XSUBs will be given a Perl prototype. The prototype string is +usually generated from the XSUB's parameter list. This keyword may be used +multiple times in an XS module to enable and disable prototypes for +different parts of the module. + +For example, these two XSUB declarations: + + int add1(int a, int b) + + PROTOTYPES: ENABLE + + int add2(int a, int b) + +behave similarly to the perl-level: + + sub add1 { ... } + sub add2($$) { ... } + +Note also that prototypes can be overridden on a per-XSUB basis with the +XSUB-level L keyword. + +In general, XSUB prototypes (similarly to perl sub prototypes) are of very +limited use and are typically only used to mimic the behaviour of Perl +builtins. For example there is no way to implement a C +style function without a way of telling the Perl interpreter not to +flatten C<@a>. Outside of these narrow uses, it is generally a mistake to +use prototypes. + +In the early days of XS it was thought that using prototypes was probably +a Good Thing, and prototypes were enabled by default. This was soon +changed to disabled by default, and a warning was added if you haven't +explicitly indicated your preference: so in the absence of any +C keyword, you will get this nagging warning: + + Please specify prototyping behavior for Foo.xs (see perlxs manual) + +So 99% of the time you will want to add + + PROTOTYPES: DISABLE + +to the start of the XS half of your C<.xs> file. + +The C keyword corresponds to F's C<-prototypes> and +C<-noprototypes> options. + +See L for more information about Perl prototypes. + +=head3 The EXPORT_XSUB_SYMBOLS: Keyword + + EXPORT_XSUB_SYMBOLS: ENABLE | DISABLE + +This keyword is present since 5.16.0, and its value is disabled by +default. + +Before 5.16.0, the C function which implemented an XSUB was exported. +Since 5.16.0, it is declared C. 
The old behaviour can be restored +by enabling it. You are very unlikely to have a need for this keyword. + +=head3 The INCLUDE: Keyword + + INCLUDE: const-xs.inc + INCLUDE: some_command | + +This keyword can be used to pull in the contents of another file to the +"XS" part of an XS file. Unlike a top-level XS file, included files don't +have a "C" first half, and the entire contents of the file are treated as +XS, as if it had all been inserted at that line. + +One common use of C is to include constant definitions generated +by F. + +If the parameters to the C keyword are followed by a pipe (C<|>) +then the XS parser will interpret the parameters as a command. This +feature is mildly deprecated in favour of the C +directive, as documented below. The latter can be used to ensure that the +perl (if any) used in the command is the same as the one running the XS +parser. + +=head3 The INCLUDE_COMMAND: Keyword + + INCLUDE_COMMAND: $^X -e '...' + +Since 5.14.0. + +Similar to C except that the C<|> is implicit, and +it converts the special token C<$^X>, if present, to the path of the perl +interpreter which is running the XS parser. + +=head3 The TYPEMAP: Keyword + + TYPEMAP: < keyword can be used to embed typemap declarations +directly into your XS code, instead of (or in addition to) typemaps in a +separate file. Multiple such embedded typemaps will be processed in order +of appearance in the XS code. Typemaps are processed in the order: + +=over + +=item * + +The system typemap file. + +=item * + +A local typemap file, typically specified by C +in the F. + +=item * + +C entries, in order. + +=back + +The most recently-applied entries take precedence, so for example you can +use C to individually override specific C, C, or +C entries in the system typemap. In general, typemap changes +affect any subsequent XSUBs within the file, until further updates. Note +however that due to a quirk in parsing, it is possible for a C +entry immediately I an XSUB to affect that XSUB.
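The precedence rule above, where the most recently applied entry wins, can be modelled with a short C sketch. This is only a model of the rule and not how the XS parser stores typemaps: the table, the struct, and the function lookup_xstype are hypothetical, while T_AVREF, T_IV and T_AVREF_REFCOUNT_FIXED are XS types from the standard typemap.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct tm_entry {
    const char *ctype;   /* C type as written in an XSUB declaration */
    const char *xstype;  /* XS type which that C type maps to */
};

/* Entries listed in the order they were applied: earliest first. */
static const struct tm_entry entries[] = {
    { "AV *", "T_AVREF" },                 /* from the system typemap */
    { "int",  "T_IV" },                    /* from the system typemap */
    { "AV *", "T_AVREF_REFCOUNT_FIXED" },  /* from a later TYPEMAP: block */
};

/* Hypothetical helper: scan from the newest entry backwards, so that
 * the most recently applied mapping for a C type takes precedence.
 * Returns NULL if the C type has no mapping at all. */
static const char *
lookup_xstype(const char *ctype)
{
    size_t i = sizeof(entries) / sizeof(entries[0]);

    assert(ctype != NULL);
    while (i-- > 0) {
        if (strcmp(entries[i].ctype, ctype) == 0)
            return entries[i].xstype;
    }
    return NULL;
}
```

In this model the later entry for "AV *" shadows the system one, while "int" is unaffected because nothing overrides it.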
+ +The C keyword syntax is intended to mimic Perl's "heredoc" +syntax, and the keyword must be followed by one of these three forms: + + << FOO + << 'FOO' + << "FOO" + +where C can be just about any sequence of characters, which must be +matched at the start of a subsequent line. + +See L and L for more details on writing +typemaps. + +=head3 The BOOT: Keyword + + BOOT: + # Print a message when the module is loaded + printf("Hello from the bootstrap!\n"); + +The C keyword is used to add code to the extension's bootstrap +function. This function is generated by the XS parser and normally holds +the statements necessary to register any XSUBs with Perl. It is usually +called once, at C time. + +This keyword should appear on a line by itself. All subsequent lines will +be interpreted as lines of C code to pass through, including C +preprocessor directives, but excluding POD and C<#> comments; until the +next keyword or possible start of a new XSUB (C). + +=head3 The FALLBACK: Keyword + + MODULE = Foo PACKAGE = Foo::Bar + + FALLBACK: TRUE | FALSE | UNDEF + +Since 5.8.1. + +It defaults to C for each package. It sets the default fallback +handling behaviour for overloaded methods in the current package (i.e. +C in the example above). It is analogous to the Perl-level: + + package Foo::Bar; + use overload "fallback" => 1 | 0 | undef; + +It only has any effect if there ends up being at least one XSUB in the +current package with the L keyword +present. See L for more details. + +=head2 The Structure of an XSUB + +Following any file-scoped XS keywords and directives, an XSUB may appear. +The start of an XSUB is usually indicated by a blank line followed by +something starting on column one which isn't otherwise recognised as an +XSUB keyword or file-scoped directive. + +An XSUB definition consists of a declaration (typically two lines), +followed by an optional body. The declaration specifies the XSUB's name, +parameters and return type. 
The body consists of sections started by +keywords, which may specify how its parameters and any return value +should be processed, and what the main C code body of the XSUB consists +of. Other keywords can change the behaviour of the XSUB, or affect how it +is registered with Perl, e.g. with extra named aliases. In the absence of +an explicit main C code body specified by the C or C +keywords, the parser will generate a body automatically; this is referred +to as L in this document. + +Nothing can appear between keyword sections apart from POD, XS comments, +and trailing blank lines, all of which are stripped out before the main +parsing takes place. Anything else will either raise an error, or be +interpreted as the start of a new XSUB. + +An XSUB's body can be thought of as having up to five parts. These are, in +order of appearance, the L, L, L, L and L parts. There is no +formal syntax to define this structure; it's just an understanding that +certain keywords may only appear in certain parts and thus may only appear +after certain other keywords etc. + + +=head2 An XSUB Declaration + + # A simple declaration: + + int + foo1(int i, char *s) + + # All on one line; plus a default parameter value: + + int foo2(int i, char *s = "") + + # Complex parameters; plus variable argument count: + + int + foo3(OUT int i, IN_OUTLIST char *s, STRLEN length(s), ...) + + # No automatic argument processing: + + void + foo4(...) + PPCODE: + + # C++ method; plus various return type qualifiers: + + NO_OUTPUT extern "C" static int + X::Y::foo5(int i, char *s) const + + +An XSUB declaration consists of a return type, name, parameters, and +optional C, C, C and C keywords. + +=head3 An XSUB's return type and the NO_OUTPUT keyword + +The return type can be any valid C type, including C. When non-void, +it serves two purposes. First, it causes a C auto variable of that type +to be declared, called C.
Second, it (usually) makes the XSUB +return a single SV whose value is set to C's value at the time of +return. In addition, a non-void autocall XSUB will call the underlying C +library function and assign its return value to C. + +In addition the return type can be a Perl package name; see +L for details. + +If the return type is prefixed with the C keyword, then the +C variable is still declared, but code to return its value is +suppressed. It is typically useful when making an autocall function +interface more Perl-like, especially when the C return value is just an +error condition indicator. For example, + + NO_OUTPUT int + delete_file(char *name) + # implicit autocall code here: RETVAL = delete_file(name); + POSTCALL: + if (RETVAL != 0) + croak("Error %d while deleting file '%s'", RETVAL, name); + +Here the generated XS function returns nothing on success, and will +C with a meaningful error message on error. The XSUB's return type +of C is only meaningful for declaring C and for doing the +autocall. + +The return type can also include the C and C +modifiers, which if present must be in that order, and come between any +C keyword and the return type. The C declaration must +be written exactly as shown, i.e. with a single space and with double +quotes around the C. These two modifiers are mainly of use for XSUBs +written in C++. A C++ XSUB declaration is also allowed to have a trailing +C keyword, which mimics the C++ syntax. See L +for more details. + +=head3 An XSUB's name + +The name of the XSUB is usually put on the line following the type, in +which case it must be on column one. It is permissible for both the return +type and name to be on the same line. + +The name can be any valid Perl subroutine name. The C value from +the most recent C declaration is used to give the XSUB its +fully-qualified Perl name.
+ +If the name includes the package separator, C<::>, then it is treated as +a C++ method declaration, and various extra bits of processing take +place, such as declaring an implicit C parameter. The XSUB's I +package name is still determined by the current XS package, and not the +C++ class name. See L for more details. + +=head3 An XSUB's parameter list + +Following the XSUB's name, there is a comma-separated list of parameters +within parentheses. Although this looks superficially the same as a C +function declaration, it is different. In particular, it is parsed by the +XS compiler, which is a simple regex-based text processor and which +doesn't understand the full C type syntax; nor does it recognise C-style +comments. + +In fact all it does is extract the text between the C<(...)> and split on +commas, while having enough intelligence to ignore commas and a closing +parenthesis within a double-quoted string. Once each parameter declaration +is extracted, it is processed, as described below in +L. + +Each parameter declaration usually generates a C auto variable declaration +of the same name, along with initialisation code which assigns the value +of the corresponding passed argument to that variable. Under some +circumstances code can also be generated to return the value too. + +Note that the original XS syntax required the type for each parameter to +be specified separately in one or more INPUT sections, mimicking pre-C89 +"K&R" C syntax. To support this, directly after the declaration there is an +implicit INPUT section, without a need to include the actual keyword. You +will see this pattern very frequently in older XS code. + +Old style with an implicit INPUT keyword (a common pattern): + + int + foo(a, b) + long a + char *b + CODE: + ... + +Old style with explicit INPUT keyword (unusual): + + int + foo(a, b) + INPUT: + long a + char *b + CODE: + ... + +New style (recommended for new code): + + int + foo(long a, char *b) + CODE: + ...
+ +Generally there is no reason to use the old style any more, apart from a few +obscure features that can be specified on an INPUT line but not in the +signature. + + +=head2 An XSUB Parameter + +Some examples of valid XSUB parameter declarations: + + char *foo # parameter with type + Foo::Bar foo # parameter with Perl package type + char *foo = "abc" # default value + char *foo = NO_INIT # doesn't complain if arg missing + OUT char *foo # caller's arg gets updated + IN_OUTLIST char *foo # parameter value gets returned + int length(foo) # pseudo-parameter that gets the length of foo + foo # placeholder, or parameter without type + SV* # placeholder + ... # ellipsis: zero or more further arguments + +The most straightforward type of declaration in an XSUB's parameter list +consists of just a C type followed by a parameter name, such as C. This has two main effects. First, it causes a C auto variable of +that name to be declared; and second, the variable is initialised to the +value of the passed argument which corresponds to that parameter. For +example, + + void + foo(int i, char *s) + +is roughly equivalent to the Perl: + + sub foo { + my $i = int($_[0]); + my $s = "$_[1]"; + ... + } + +and the generated C code may look something like: + + if (items != 2) + croak_xs_usage(cv, "i, s"); + + { + int i = (int)SvIV(ST(0)); + char *s = (char *)SvPV_nolen(ST(1)); + foo(i, s); /* autocall */ + ... + } + +In addition to the variable declaration and initialisation, the name of +the parameter will usually be used in the usage message and in any +autocall, as shown above. These variables are accessible for any user code +in a C block or similar. Their values aren't normally returned. + +There are several variations on this basic pattern, which are explained in +the following subsections. + +=head3 Fully-qualified type names and Perl objects + + Foo::Bar + foo(Foo::Bar self, ...) + +Normally the type of an XSUB's parameter or return value is a valid C type, +such as C<"char *">.
However you can also use Perl package names. When a +type name includes a colon, it undergoes some extra processing; in +particular, the actual type as emitted into the C file is transformed +using C (unless F has been invoked with C<-hiertype>), so +that a legal C type is present. The complete effects for a type of +C are as follows. + +The type string C is looked up in the typemap I to find +the logical XS type; then the C and C typemap templates are +expanded with the C<$ntype> variable set to C<"Foo::Bar"> and the C<$type> +variable set to C<"Foo__Bar">. The declaration of the corresponding auto +variables uses the modified type string, so the example above might +result in these declarations in the C code: + + Foo__Bar RETVAL; + Foo__Bar self = ...; + +With the appropriate XS typemap entries and C typedefs, this can be used +to assist in declaring XSUBs which are passed and return Perl objects. +See L for an example of this using +the common C typemap type. + +=head3 XSUB Parameter Placeholders + +Sometimes you want to skip an argument. There are two supported techniques +for efficiently declaring a placeholder. Both of these will completely +skip any declaration and initialisation of a C auto variable, but will +still consume an argument. + +A bare parameter name is treated as a placeholder if it has a name but no +type specified: neither in the signature, nor in any following C +section. For example: + + void + foo(int a, b, char *c) + CODE: + ... + +is roughly equivalent to the Perl: + + sub foo { + my $a = int($_[0]); + my $c = "$_[2]"; + ... + } + +A parameter containing just the specific type C and no name is +treated specially. A bug in the XS parser meant that it used to skip any +parameter declaration which wasn't parsable. This inadvertently made many +things de facto placeholder declarations. A common usage was C, which +is now officially treated as a placeholder for backwards compatibility.
+Any other bare types without a parameter name are errors since +C 3.57. Note that the C text will appear in any +C error message. For example, + + void + foo(int a, SV*, char *c) + +may croak with: + + Usage: Foo::Bar::foo(a, SV*, c) at ... + +Placeholders can't be used with autocall unless you use C to +override the missing argument. For example: + + void + foo(int a, b, char *c) + C_ARGS: a, c + +=head3 Updating and returning parameter values: the IN_OUT etc keywords + + int i + IN int i + IN_OUT int i + IN_OUTLIST int i + OUT int i + OUTLIST int i + +Normally a parameter declaration causes a C auto variable of the same name +to be declared and initialised to the value of the corresponding passed +argument. These modifiers can make parameters also update or return +values, and can also cause the initialisation to be skipped. They come at +the start of a parameter declaration. + +These modifiers address the issue that, because a simple C function takes +a fixed number of I parameters and returns a I value, +the basic XSUB syntax has been designed to reflect that pattern. + +The usual way to make a more complex C function API is to pass pointers to +variables, which the C function will use to set or update the variables. +For example, a couple of hypothetical C functions might be called as: + + int time = ....; // an integer in the range 0..86399 + int hour, min, sec; + parse_time(time, &hour, &min, &sec); // set hour, min, sec + increment_time(&hour, &min, &sec); // update hour, min, sec + +The XS C etc modifiers allow you to write XSUBs which can wrap +such functions with autocall, and in general update passed arguments or +return multiple values. + +The rules of these parameter modifiers are: + +=over + +=item * + +All such parameters, regardless of the modifier, cause a C auto variable +of the same name to be declared. + +=item * + +In the absence of a modifier, it defaults to C. 
+ +=item * + +The text of the modifier has up to two parts, separated by an underscore. +The input part comes first and can have the value C<''> or C, while +the output part can be one of C<''>, C, or C. + +=item * + +An input part of C (the default) causes that variable to be +initialised to the value of the corresponding passed argument. Otherwise +the initialisation is skipped: in particular, this means that an autocall +function will be passed a pointer to an uninitialised value: the +assumption being that the library function will set, but not use, that +value. + +=item * + +An output part of C<''> (the default) means nothing is done to return the +value of the auto variable. + +=item * + +An output part of C or C causes the value to be returned in +some fashion, and in addition, any autocall code will prefix such +variables with C<&> when calling the wrapped C library function. + +=item * + +An output part of C causes the corresponding passed argument to be +I with the value of the variable. + +=item * + +An output part of C causes the value of the variable to be +returned as an extra SV after the C value, if any. They are +returned in the order they appear in the XSUB's parameter list. + +=item * + +For the specific case of the modifier being C, it is a +pseudo-parameter, and I. It doesn't form part of +the signature of the XSUB, although it I used for any autocall. So for +example: + + int + foo(int a, OUTLIST int b, int c) + +gets converted to roughly this C code: + + if (items != 2) + croak_xs_usage(cv, "a, c"); + { + int RETVAL; + int a = (int)SvIV(ST(0)); + int b; + int c = (int)SvIV(ST(1)); + RETVAL = foo(a, &b, c); + } + ... push the values of RETVAL and b onto the stack and return ... + +=back + +The approximate perl equivalents for these modifiers are given in the +examples below, where the perl code C stands in for the C +autocall C. 
+ + + int i sub foo { + IN int i my $i = $_[N]; + real_foo($i); + } + + IN_OUT int i sub foo { + my $i = $_[N]; + real_foo(\$i); + $_[N] = $i; + } + + IN_OUTLIST int i sub foo { + my $i = $_[N]; + real_foo(\$i); + return ..., $i, ...; + } + + OUT int i sub foo { + my $i; + real_foo(\$i); + $_[N] = $i; + } + + OUTLIST int i sub foo { + my $i; # NB $_[N] is not consumed + real_foo(\$i); + return ..., $i, ...; + } + +Together, they allow you to wrap C functions which use pointers to return +extra values; either preserving the C-ish API in perl (C), or +providing a more Perl-like API (C). For example, wrapping the +C function from the example above could be done using +C: + + void + parse_time(int time, \ + OUT int hour, OUT int min, OUT int sec) + +which could be called from perl as: + + my ($hour, $min, $sec); + # set ($hour, $min, $sec) to (23,59,59): + parse_time(86399, $hour, $min, $sec); + +Or by using C: + + void + parse_time(int time, \ + OUTLIST int hour, OUTLIST int min, OUTLIST int sec) + +which could be called from perl as: + + # set ($hour, $min, $sec) to (23,59,59): + my ($hour, $min, $sec) = parse_time(86399); + + +=head3 Default Parameter Values + + int + foo(int i, char *s = "abc") + + int + bar(int i, int j = i + ')', char *s = "abc,)") + + int + baz(int i, char *s = NO_INIT) + +Optional parameters can be indicated by appending C<= C_expression> to the +parameter declaration. The C expression will be evaluated if not enough +arguments are supplied. Parameters with default values should come after +any mandatory parameters (although this is currently not enforced by the +XS compiler). The value can be any valid compile-time or run-time C +expression (but see below), including the values of any parameters +declared to its left. The special value C indicates that the +parameter is kept uninitialised if there isn't a corresponding argument. + +The XS parser's handling of default expressions is rather simplistic.
It +just wants to extract parameter declarations (including any optional +trailing default value) from a comma-separated list, but it doesn't +understand C syntax. It can handle commas and closing parentheses within a +quoted string, but currently not an escaped quote such as C<'\''> or +C<"\"">. Neither can it handle balanced parentheses such as C. + +Due to an implementation flaw, default value expressions are currently +evalled in double-quoted context during parsing, in a similar fashion to +typemap templates. So for example C is expanded to +C or similar. This behaviour may be fixed at some +point; in the meantime, it is best to avoid the C<$> and C<@> characters +within default value expressions. + +=head3 The C pseudo-parameter + + int + foo(char *s, int length(s)) + +It is common for a C function to take a string pointer and length as two +arguments, while in Perl, string-valued SVs combine both the string and +length in a single value. To simplify generating the autocall code in such +situations, the C pseudo-parameter acts as the length of the +parameter C. It doesn't consume an argument or appear in the XSUB's +usage message, but it I passed to the autocalled C function. For +example, this XS: + + void + foo(char *s, short length(s), int t) + +translates to something similar to this C code: + + if (items != 2) + croak_xs_usage(cv, "s, t"); + + { + STRLEN STRLEN_length_of_s; + short XSauto_length_of_s; + char * s = (char *)SvPV(ST(0), STRLEN_length_of_s); + int t = (int)SvIV(ST(1)); + + XSauto_length_of_s = STRLEN_length_of_s; + foo(s, XSauto_length_of_s, t); + } + +and might be called from Perl as: + + foo("abcd", 9999); + +The exact C code generated will vary over releases, but the important +things to note are: + +=over + +=item * + +The auto variable C will be declared with the +specified type and will be passed to any autocall function, but it won't +appear in the usage message. This variable is available for use in C +blocks and similar. 
+ +=item * + +The auto variable C is used in addition to allow +conversion between the type expected by C and the type declared +for the length pseudo-parameter. + +=item * + +A length parameter can appear anywhere in the signature, even before the +string parameter of the same name; but its position in any autocall +matches its position in the signature. + +=item * + +Each length parameter must match another parameter of the same name. That +parameter must be a string type (something which maps to the C +typemap type). + +=back + +=head3 Ellipsis: variable-length parameter lists + + int + foo(char *s, ...) + +An XSUB can have a variable-length parameter list by specifying an +ellipsis as the last parameter, similar to C function declarations. Its +main effect is to disable the error check for too many parameters. Any +declared parameters will still be processed as normal, but the programmer +will have to access any extra arguments manually, making use of the +C macro to access the nth item on the stack (counting from 0), and +the C variable, which indicates the total number of passed +arguments, including any fixed arguments. + +Note that currently XS doesn't provide any mechanism to autocall +variable-length C functions, so the ellipsis should only be used on XSUBs +which have a body. + +For example, consider this Perl subroutine which returns the sum of all +of its arguments which are within a specified range: + + sub minmax_sum { + my $min = shift; + my $max = shift; + my $RETVAL = 0; + $RETVAL += $_ for grep { $min <= $_ && $_ <= $max } @_; + return $RETVAL; + } + +This XSUB provides equivalent functionality: + + int + minmax_sum(int min, int max, ...) + CODE: + { + int i = 2; /* skip the two fixed arguments */ + RETVAL = 0; + + for (; i < items; i++) { + int val = (int)SvIV(ST(i)); + if (min <= val && val <= max) + RETVAL += val; + } + } + OUTPUT: + RETVAL + +It is possible to write an XSUB which both accepts and returns a list. 
For +example, this XSUB does the equivalent of the Perl C + + void + triple(...) + PPCODE: + SP += items; + { + int i; + for (i = 0; i < items; i++) { + int val = (int)SvIV(ST(i)); + ST(i) = sv_2mortal(newSViv(val*3)); + } + } + +Note that the L keyword, in comparison to +C, resets the local copy of the argument stack pointer, and relies +on the coder to place any return values on the stack. The example above +reclaims the passed arguments by setting C back to the top of the +stack, then replaces the items on the stack one by one. + +=head2 The XSUB Input Part + +Following an XSUB's declaration part, the body of the XSUB follows. The +first part of the body is the I part, and is mainly concerned with +declaring auto variables and assigning to them values extracted from the +passed parameters. The two main keywords associated with this activity are +L and L. The +first allows you to inject extra variable declaration lines, while the +latter used to be needed to specify the type of each parameter, but is now +mainly of historical interest. This is also the place for the rarely-used +L keyword. + +Note that the keywords described in L and L may also appear in this part, plus the L and L keywords. + +=head3 The PREINIT: Keyword + + PREINIT: + int i; + char *prog_name = get_prog_name(); + +This keyword allows extra variables to be declared and possibly +initialised immediately before the declarations of auto variables +generated from any parameter declarations or C lines. Any lines +following C until the next keyword (except POD and XS comments) +are copied out as-is to the C code file. Multiple C keywords are +allowed. + +It is sometimes needed because in traditional C, all variable +declarations must come before any statements. 
While this is no longer a +restriction in the perl interpreter source since 5.36.0, the C compiler +flags used when compiling XS code may be different and so, depending on +the compiler, it may still be necessary to preserve the correct ordering. + +Any variable declarations generated by C and lines from C +are output in the same order they appear in the XS source, followed by any +variable declarations generated from the XSUB's parameter declarations. +These may be followed by statements to initialise those variables. +Thus, any variable declarations in a later C or C block may be +flagged as a declaration-after-statement. + +C code shouldn't assume that any variables declared earlier have +already been initialised; initialisation is deferred if the initialisation +code (typically obtained from a typemap) isn't of the simple C form, or has a default value. + +For example: + + void + foo(int i = 0) + PREINIT: + int j = 1; + CODE: + bar(i, j); + +might be translated into C code similar to: + + { + int j = 1; + int i; + + if (items < 1) + i = 0; + else { + i = (int)SvIV(ST(0)); + } + bar(i, j); + } + +Usually you could dispense with C by just wrapping the code in +C blocks in braces, but it may be necessary if the ordering of the +variable initialisations is sensitive, e.g. if it is affected by some +changing global state. + +=head3 The INPUT: Keyword + + void + foo(a, b, c, d, e, int f) + # implicit INPUT section + int a + # explicit INPUT section + INPUT: + long &b + int c = ($type)MySvIV($arg) + int d = NO_INIT + int e + if (some_condition) { $var += 1 } + ... + +Immediately following an XSUB's declaration, there is an implicit C +section, i.e. the parser behaves as if there was a literal "C +line injected before the first line of the body. This can be followed by +zero or more explicit C sections, possibly interleaved with other +keywords and sections such as C.
+ +When XS was first created, it was modelled on the syntax of pre-ANSI C, +which required the types of parameters to be separately specified. It was +later updated to allow parameter types to be specified in the parameter +list, like ANSI C. Thus there is rarely any good reason to use C +sections now; but you will often encounter them in older code. + +Each C line, at a minimum, specifies the type of a parameter listed +in the XSUB's signature, e.g. + + char *s + +In addition, the variable name may be prefixed with C<&> to indicate that +a pointer to the variable should be passed to any autocall function; and +may have a postfix initialisation modifier starting with one of the three +characters C<= + ;>. + +Note that if a variable name doesn't match any of the declared parameters, +then it I be treated as an auto variable declaration (depending on +the perl version and on whether it has an initialisation override). This +misfeature may be deprecated at some point in the future, so don't rely on +it: use a C section if necessary. These two examples are mostly +equivalent, with the first form being preferred: + + void + foo(int a) + PREINIT: + short b = 1; + + void + foo(a) + int a + short b = 1; + +=head4 The & variable modifier in INPUT + +The C<&> variable modifier has the single effect that the corresponding +argument passed to an autocall function will have the variable name +prefixed with C<&>. Combined with C, this allows the +I
of a variable to be passed to a wrapped function, which updates +that variable's value; on return, the XSUB updates the caller's arg with +that value. The modern equivalent of this is to declare the parameter as +C. These two XSUBs are equivalent: + + void + foo(IN_OUT int i) + + void + foo(i) + int &i + OUTPUT: + i + +and they both wrap a C function called foo() that takes a single C +argument which (presumably) updates the integer pointed to. They both +generate C code similar to this: + + int i = (int)SvIV(ST(0)); + foo(&i); + sv_setiv(ST(0), (IV)i); + + +=head4 Altering variable initialisation in INPUT + +Normally each declared parameter causes a C auto variable of the same name +to be declared, and for code to be planted which initialises that variable +to the value of the corresponding passed argument. The initialisation code +is usually obtained by expanding the typemap template corresponding to +the parameter's type. It is possible to override, augment, or skip that +initialisation code, by appending one of the three characters C<= + ;> and +an initialiser expression, to the C line. + + void + foo(a,b,c,d,e,f,g) + + # Use the standard typemap entry: + int a + # and with optional trailing colon + int b; + + # Override the typemap entry: + int c = ($type)MySvIV($arg) + + # Skip the initialisation entirely: + int d = NO_INIT + int e ; NO_INIT + + # Add deferred initialisation code + # *in addition* to the standard init: + int f + if (some_condition) { $var += 1 } + + # Add deferred initialisation code + # *instead of* the standard init: + int g ; if (some_condition) { $var += 1 } + +Any override code is passed through template expansion in the same way +that typemap templates are, with C<$var>, C<$arg>, C<$type> etc being +expanded. Deferred initialisation code is placed after all variable +declarations. 
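The ordering described above (declarations first, deferred initialisation code afterwards) can be pictured with a small stand-alone C sketch. This is only a hand-written illustration of the general shape, not real output from the XS compiler: C<fake_arg()>, C<some_condition> and C<xsub_body_sketch()> are hypothetical names invented here, with C<fake_arg(n)> standing in for something like C<(int)SvIV(ST(n))>, and the actual generated code varies between releases.

```c
#include <assert.h>

/* Hypothetical stand-in for fetching the nth argument from the Perl
 * stack, e.g. (int)SvIV(ST(n)); here it just reads a fixed array. */
static int fake_arg(int n)
{
    static const int args[] = { 10, 20, 30 };
    return args[n];
}

/* Hypothetical condition used by the deferred initialisation code. */
static int some_condition = 1;

/* Rough shape of a body for INPUT lines such as:
 *     int c = <override expression>
 *     int d = NO_INIT
 *     int f + if (some_condition) { $var += 1 }
 */
static int xsub_body_sketch(void)
{
    /* 1. All declarations come first. */
    int c = fake_arg(0) * 2;  /* '=': override replaces the usual init  */
    int d;                    /* NO_INIT: declared, but not initialised */
    int f = fake_arg(2);      /* '+': the usual init is kept ...        */

    /* 2. ... and deferred code runs after every declaration. */
    if (some_condition) { f += 1; }

    /* 3. Main body: d must be assigned before it is read. */
    d = 5;
    return c + d + f;
}
```

Calling C<xsub_body_sketch()> in this toy setup adds the overridden value, the manually assigned C<d>, and the adjusted C<f>; the point is only the relative placement of the three kinds of code.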
+ +In modern XS where C is not often used, some of these initialiser +effects can be achieved in other ways: + +=over + +=item * + +an overridden typemap entry could be specified by using C +to add a template for the type of this variable; + +=item * + +skipping initialisation can be achieved using the C and C +parameter declaration modifiers; + +=item * + +adding deferred initialisation code may be achievable via C or +C blocks. + +=back + +=head3 The SCOPE: Keyword and typemap entry + + # XSUB-scoped + void + foo(int i) + SCOPE: ENABLE + CODE: + ... + + # file-scoped + SCOPE: ENABLE + void + bar(int i) + CODE: + ... + + # typemap entry + TYPEMAP: < keyword can be used to enable scoping for a particular XSUB +(disabled by default). Its effect is to wrap the main body of the XSUB +(including most parameter and return value processing) within an C<{ +ENTER;> and C pair. This has the effect of clearing any +accumulated savestack entries at the end of the code body. If disabled, +then the savestack will usually be cleared by the caller anyway, so this +is a rarely-used keyword. + +The SCOPE keyword may be either XSUB-scoped or file-scoped (this refers to +the scope of the keyword within the XS file, not to the scope generated by +the keyword). For the first, it may appear anywhere in the input part of +the XSUB. For the latter, it may appear anywhere in file scope, but due to +a long-standing parser bug, the keyword's state is reset at the start of +each XSUB, so it will only have any effect if it appears just before an XSUB +declaration and as part of the same paragraph (i.e. with no intervening +blank lines), such as in the example above. It will only affect the single +following XSUB. + +The XSUB-scoped form has been available since perl-5.004, but was broken +by perl-5.12.0 (F v2.21) and fixed in perl-5.44.0 (F +v3.58). The file-scoped form has been available since perl-5.12.0.
+ +To support potentially complex type mappings, if an C typemap entry +contains a code comment like C, then scoping will be +automatically enabled for any XSUB which uses that typemap entry. This +currently only works for parameters whose type is specified using +old-style C lines rather than an ANSI-style declaration, i.e. not +for C. In fact, the XS parser, when looking for a SCOPE +comment in a typemap, is currently very lax: it's actually a +case-insensitive match of any code comment which contains the text "scope" +plus anything else. But you shouldn't rely on this; always use the form +shown here. Even better, just don't use it at all. + +=head2 The XSUB Init Part + +Following an XSUB's input part, an optional init part follows. This +consists solely of the C keyword described below, plus the keywords +described in L and L, plus the +L, L and +L keywords. + +=head3 The INIT: Keyword + +The C keyword allows arbitrary initialisation code to be inserted after +any variable declarations (and their initialisations), but before the main +body of code. It is primarily intended for use when the main body is an +autocall to a C function. For example these two XSUBs are equivalent: -=head2 The Anatomy of an XSUB + int + foo(int i) + INIT: + if (i < 0) + XSRETURN_UNDEF; -The simplest XSUBs consist of 3 parts: a description of the return -value, the name of the XSUB routine and the names of its arguments, -and a description of types or formats of the arguments. + int + foo(int i) + CODE: + if (i < 0) + XSRETURN_UNDEF; + RETVAL = foo(i); + OUTPUT: + RETVAL -The following XSUB allows a Perl program to access a C library function -called sin(). The XSUB will imitate the C function which takes a single -argument and returns a single value. +Any lines following C until the next keyword (except POD and XS +comments) are copied out as-is to the C code file. Multiple C +keywords are allowed.
- double - sin(x) - double x +=head2 The XSUB Code Part -Optionally, one can merge the description of types and the list of -argument names, rewriting this as +Following an XSUB's optional init part, an optional code part follows. This +consists mainly of the C or C keywords, which provide the +code block for the main body of the XSUB. These two keywords are similar, +except that C can be thought of as acting at a lower level; it +resets the stack pointer to the base of the stack frame and then relies on +the programmer to push any return values; whereas C will (with +prompting) automatically generate code to return the value of C. - double - sin(double x) +There is also a rarely-used C keyword which generates +a body which croaks. -This makes this XSUB look similar to an ANSI C declaration. An optional -semicolon is allowed after the argument list, as in +Only one of these keywords may appear in this part, and at most once; and +no other keywords are recognised in this part (although such keywords +could instead be processed in the tail or head of the preceding and +following init and output parts). - double - sin(double x); +In the absence of any of those three keywords, the XS compiler will +generate an autocall: a call to the C function of the same name as the +XSUB. -Parameters with C pointer types can have different semantic: C functions -with similar declarations - - bool string_looks_as_a_number(char *s); - bool make_char_uppercase(char *c); - -are used in absolutely incompatible manner. Parameters to these functions -could be described to B like this: - - char * s - char &c - -Both these XS declarations correspond to the C C type, but they have -different semantics, see L<"The & Unary Operator">. - -It is convenient to think that the indirection operator -C<*> should be considered as a part of the type and the address operator C<&> -should be considered part of the variable. See L -for more info about handling qualifiers and unary operators in C types.
- -The function name and the return type must be placed on -separate lines and should be flush left-adjusted. - - INCORRECT CORRECT - - double sin(x) double - double x sin(x) - double x - -The rest of the function description may be indented or left-adjusted. The -following example shows a function with its body left-adjusted. Most -examples in this document will indent the body for better readability. - - CORRECT - - double - sin(x) - double x - -More complicated XSUBs may contain many other sections. Each section of -an XSUB starts with the corresponding keyword, such as INIT: or CLEANUP:. -However, the first two lines of an XSUB always contain the same data: -descriptions of the return type and the names of the function and its -parameters. Whatever immediately follows these is considered to be -an INPUT: section unless explicitly marked with another keyword. -(See L.) - -An XSUB section continues until another section-start keyword is found. - -=head2 The Argument Stack - -The Perl argument stack is used to store the values which are -sent as parameters to the XSUB and to store the XSUB's -return value(s). In reality all Perl functions (including non-XSUB -ones) keep their values on this stack all the same time, each limited -to its own range of positions on the stack. In this document the -first position on that stack which belongs to the active -function will be referred to as position 0 for that function. - -XSUBs refer to their stack arguments with the macro B, where I -refers to a position in this XSUB's part of the stack. Position 0 for that -function would be known to the XSUB as ST(0). The XSUB's incoming -parameters and outgoing return values always begin at ST(0). For many -simple cases the B compiler will generate the code necessary to -handle the argument stack by embedding code fragments found in the -typemaps. In more complex cases the programmer must supply the code. 
- -=head2 The RETVAL Variable - -The RETVAL variable is a special C variable that is declared automatically -for you. The C type of RETVAL matches the return type of the C library -function. The B compiler will declare this variable in each XSUB -with non-C return type. By default the generated C function -will use RETVAL to hold the return value of the C library function being -called. In simple cases the value of RETVAL will be placed in ST(0) of -the argument stack where it can be received by Perl as the return value -of the XSUB. - -If the XSUB has a return type of C then the compiler will -not declare a RETVAL variable for that function. When using -a PPCODE: section no manipulation of the RETVAL variable is required, the -section may use direct stack manipulation to place output values on the stack. - -If PPCODE: directive is not used, C return value should be used -only for subroutines which do not return a value, I CODE: -directive is used which sets ST(0) explicitly. - -Older versions of this document recommended to use C return -value in such cases. It was discovered that this could lead to -segfaults in cases when XSUB was I C. This practice is -now deprecated, and may be not supported at some future version. Use -the return value C in such cases. (Currently C contains -some heuristic code which tries to disambiguate between "truly-void" -and "old-practice-declared-as-void" functions. Hence your code is at -mercy of this heuristics unless you use C as return value.) - -=head2 Returning SVs, AVs and HVs through RETVAL - -When you're using RETVAL to return an C, there's some magic -going on behind the scenes that should be mentioned. When you're -manipulating the argument stack using the ST(x) macro, for example, -you usually have to pay special attention to reference counts. (For -more about reference counts, see L.) To make your life -easier, the typemap file automatically makes C mortal when -you're returning an C. 
Thus, the following two XSUBs are more -or less equivalent: - - void - alpha() - PPCODE: - ST(0) = newSVpv("Hello World",0); - sv_2mortal(ST(0)); - XSRETURN(1); +=head3 Auto-calling a C function + +In the absence of any explicit main body code via C or C, +the XS parser will generate a body for you automatically (this is referred +to as C in this document). In its most basic form, the parser +assumes that the XSUB will be a simple wrapper for a C function of the +same name, with the same parameters and return type as the XSUB. So for +example, these two XSUB definitions are equivalent, but the first is an +autocall with less boilerplate needed: + + int + foo(char *s, short flags) - SV * - beta() + int + foo(char *s, short flags) CODE: - RETVAL = newSVpv("Hello World",0); + RETVAL = foo(s, flags); OUTPUT: - RETVAL + RETVAL + +Note that the XSUB C function and the wrapped C function are two +different entities; the first will have a name like C; +when Perl code calls the 'Perl' function C, behind the +scenes the Perl interpreter calls C, which extracts the +string and short int values from the two passed argument SVs, calls +C, then stuffs its return value into an SV and returns that to the +Perl caller. + +The two basic types of generated autocall code are: + + foo(a, b, c); + + RETVAL = foo(a, b, c); + +depending on whether the XSUB is declared C or not. The variables +passed to the function are usually just the names of the XSUB's +parameters, in the same order. Parameters with default values are +included, while ellipses are ignored. So for example + + int + foo(int a, int b = 0, ...) + +generates this autocall code: + + RETVAL = foo(a, b); + +There are various keywords which can be used to modify the basic behaviour +of an autocall. + +=over + +=item * + +The L keyword, which allows wrapped C functions +which share a common prefix in their names to be mapped to perl functions +whose names don't have that prefix. 
+ +=item * + +The +L +etc parameter modifiers, which cause that parameter to be passed to the +autocalled function with a C<&> prefix, on the assumption that the +wrapped function expects a pointer and will update the location pointed +to. -This is quite useful as it usually improves readability. While -this works fine for an C, it's unfortunately not as easy -to have C or C as a return value. You I be -able to write: +=item * - AV * - array() +The L pseudo-parameter"> +pseudo-parameter, which allows the length of another parameter to be +passed as a separate argument to the wrapped function, even though it +isn't a parameter of the Perl function. + +=item * + +The L keyword, which allows the arguments +passed to the wrapped function to be completely overridden: handy when +arguments need to be skipped or reordered compared with the perl +function. + +=item * + +The L keyword, which allows code to be added +directly before the autocall. + +=item * + +The L keyword, which allows code to be +added directly after the autocall. + +=item * + +Support for L XSUBs, which can (among other +things) modify the autocall into a C++ method call, e.g. +C<< THIS->foo(s,flags) >>. + +=back + + +=head4 The C_ARGS: Keyword + + void foo1(int a, int b, int c) + C_ARGS: b, a + + void foo2(int a, int b) + C_ARGS: a < 0 ? 0 : a, + b, + 0 + +Normally the arguments for an autocall are generated automatically, based +on the XSUB's parameter declarations. The C keyword allows you to +override this and manually specify the text that will be placed between +the parentheses in the autocall. This is useful when the ordering and +nature of parameters varies between Perl and C, without a need to write a +C or C section. + +The C section consists of all lines of text until the next keyword +or to the end of the XSUB, and is used without modification (except that +any POD or XS comments will be stripped). 
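As a rough illustration of the effect of C_ARGS, the following stand-alone C sketch mimics the two calls from the example at the start of this section. The C functions C<foo1()> and C<foo2()> here are hypothetical library functions that merely record what they receive, and C<call_foo1()> and C<call_foo2()> are hand-written stand-ins for the XSUB bodies after argument conversion; the glue code really emitted by the XS compiler will look different.

```c
#include <assert.h>

/* Hypothetical wrapped C library functions: they only record the
 * arguments they were called with, so the effect can be inspected. */
static int seen[3];

static void foo1(int x, int y)
{
    seen[0] = x; seen[1] = y; seen[2] = -1;
}

static void foo2(int x, int y, int z)
{
    seen[0] = x; seen[1] = y; seen[2] = z;
}

/* Stand-in for:  void foo1(int a, int b, int c)   C_ARGS: b, a
 * The Perl-side parameter c is accepted but never passed on, and
 * a and b are swapped in the call. */
static void call_foo1(int a, int b, int c)
{
    (void)c;
    foo1(b, a);          /* the C_ARGS text sits between the parentheses */
}

/* Stand-in for:  void foo2(int a, int b)
 *                C_ARGS: a < 0 ? 0 : a, b, 0
 * The C function takes one more argument than the Perl function. */
static void call_foo2(int a, int b)
{
    foo2(a < 0 ? 0 : a,
         b,
         0);
}
```

In this sketch, C<call_foo1(1, 2, 3)> results in C<foo1(2, 1)>, and C<call_foo2(-5, 7)> results in C<foo2(0, 7, 0)>.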
+ +=head3 The CODE: Keyword + + int + abs_double(int i) CODE: - RETVAL = newAV(); - /* do something with RETVAL */ + if (i < 0) + i = -i; + RETVAL = i * 2; OUTPUT: - RETVAL - -But due to an unfixable bug (fixing it would break lots of existing -CPAN modules) in the typemap file, the reference count of the C -is not properly decremented. Thus, the above XSUB would leak memory -whenever it is being called. The same problem exists for C, -C, and C (which indicates a scalar reference, not -a general C). -In XS code on perls starting with perl 5.16, you can override the -typemaps for any of these types with a version that has proper -handling of refcounts. In your C section, do - - AV* T_AVREF_REFCOUNT_FIXED - -to get the repaired variant. For backward compatibility with older -versions of perl, you can instead decrement the reference count -manually when you're returning one of the aforementioned -types using C: - - AV * - array() - CODE: - RETVAL = newAV(); - sv_2mortal((SV*)RETVAL); - /* do something with RETVAL */ + RETVAL + +The C keyword is the usual mechanism for providing your own code as +the main body of the XSUB. It is typically used when the XSUB, rather than +wrapping a library function, is providing general functionality which can +be more easily or efficiently implemented in C than in Perl. +Alternatively, it can still be used to wrap a library function for cases +which are too complex for autocall to handle. + +Note that on entry to the C block of code, the values of any passed +arguments will have been assigned to auto variables, but the original SVs +will still be on the stack and accessible via C if necessary. + +Similarly to autocall XSUBs, a C variable is declared if the +return value of the XSUB is not C. Unlike autocall, you have to +explicitly tell the XS compiler to generate code to return the value of +C, by using the L keyword. +(Requiring this was probably a bad design decision, but we're stuck with +it now.)
Newer XS parsers will warn if C is seen in the C +section without a corresponding C section. + +A C XSUB will typically return just the C value (or possibly +more items with the C parameter modifiers). To take complete +control over returning values, you can use the C keyword instead. +Note that it is possible for a C section to do this too, by doing its +own stack manipulation and then doing an C to return directly +while indicating that there are C items on the stack. This bypasses the +normal C etc that the XS parser will have planted after the +C lines. But it is usually cleaner to use C instead. + +Any lines following C until the next keyword (except POD and XS +comments) are copied out as-is to the C code file. Multiple C +keywords are not allowed. + +=head3 The PPCODE: Keyword + + # XS equivalent of: sub one_to_n { my $n = $_[0]; 1..$n } + + void + one_to_n(int n) + PPCODE: + { + int i; + if (n < 1) + Perl_croak_nocontext( + "one_to_n(): argument %d must be >= 1", n); + EXTEND(SP, n); + for (i = 1; i <= n; i++) + mPUSHi(i); + } + +The C keyword is similar to the C keyword, except that on +entry it resets the stack pointer to the base of the current stack frame, +and it doesn't generate any code to return C or similar: pushing +return values onto the stack is left to the programmer. In this way it can +be viewed as a lower-level alternative to C, when you want to take +full control of manipulating the argument stack. The "PP" in its name +stands for "PUSH/PULL", reflecting the low-level stack manipulation. +C is typically used when you want to return several values or even +an arbitrary list, compared with C, which normally returns just the +value of C. + +The C keyword must be the last keyword in the XSUB. Any lines +following C until the end of the XSUB (except POD and XS comments) +are copied out as-is to the C code file. Multiple C keywords are +not allowed. 
+ +Typically you declare a C XSUB with a return type of C; any +other return type will cause a C auto variable of that type to be +declared, which will be otherwise unused. + +On entry to the C block of code, the values of any declared +parameters will have already been assigned to auto variables, +but the original SVs will still be on the stack and initially accessible +via C if necessary. But the default assumption for a C +block is that you have already finished processing any supplied arguments, +and that you want to push a number of return values onto the stack. The +simple C example shown above is based on that assumption. But +more complex strategies are possible. + +There are basically two ways to access and manipulate the stack in a +C block. First, by using the C macro, to get, modify, or +replace the Ith item in the current stack frame, and secondly to push +(usually temporary) return values onto the stack. The first uses the +hidden C variable, which is set on entry to the XSUB, and is the index +of the base of the current stack frame. This remains unchanged throughout +execution of the XSUB. The second approach uses the local stack pointer, +C (more on that below), which on entry to the C block points +to the base of the stack frame. Macros like C store a temporary +SV at that location, then increment C. On return from a C +XSUB, the current value of C is used to indicate to the caller how +many values are being returned. + +In general these two ways of accessing the stack should not be mixed, or +confusion is likely to arise. The PUSH strategy is most useful when you +have no further use for the passed arguments, and just want to generate +and return a list of values, as in the C example above. The +C strategy is better when you still need to access the passed +arguments. In the example below, + + # XS equivalent of: sub triple { map { $_ * 3} @_ } + + void + triple(...)
+ PPCODE: + SP += items; + { + int i; + for (i = 0; i < items; i++) { + int val = (int)SvIV(ST(i)); + ST(i) = sv_2mortal(newSViv(val*3)); + } + } + +C is first incremented to reclaim the passed arguments which are still +on the stack; then one by one, each passed argument is retrieved, and then +each stack slot is replaced with a new mortal value. When the loop is +finished, the current stack frame contains a list of mortals, which is +then returned to the caller, with C indicating how many items are +returned. + +Before pushing return values onto the stack (or storing values at C +locations higher than the number of passed arguments), it is necessary to +ensure there is sufficient space on the stack. This can be achieved either +through the C macro as shown in the C example +above, or by using the 'X' variants of the push macros, such as +C, which can be used to check and extend the stack by one each +time. Doing a single C in advance is more efficient. C +will ensure that there is at least enough space on the stack for n further +items to be pushed. + +If using the PUSH strategy, it is useful to understand in more detail how +pushing and the local stack pointer, C are implemented. The generated +C file will have access to (among others) the following macro definitions +or similar: + + #define dSP SV **sp = PL_stack_sp + #define SP sp + #define PUSHs(s) *++sp = (s) + #define mPUSHi(i) sv_setiv(PUSHs(sv_newmortal()), (IV)(i)) + #define PUTBACK PL_stack_sp = sp + #define SPAGAIN sp = PL_stack_sp + #define dXSARGS dSP; .... + +The global (or per-interpreter) variable C is a pointer to +the current top-most entry on the stack, equal initially to +C<&ST(items-1)>. On entry to the XSUB, the C at its top will +cause the C variable to be declared and initialised. This becomes a +I copy of the argument stack pointer. The standard stack +manipulation macros such as C all use this local copy. 
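To make the "local copy" idea concrete, here is a toy model in plain C. It does not use perl's headers: the C<SV> type is just an C<int>, C<toy_stack_sp> plays the role of C<PL_stack_sp>, the macros are simplified re-definitions modelled on the ones listed above, and C<push_two_via_global()> and C<demo()> are hypothetical helpers invented for this sketch. It only aims to show why pushes made through the local pointer are invisible to other functions until C<PUTBACK>, and why C<SPAGAIN> is needed after calling a function which itself pushes.

```c
#include <assert.h>

typedef int SV;                           /* toy "SV": just an int      */

static SV  stack_storage[16];
static SV *toy_stack_sp = stack_storage;  /* stands in for PL_stack_sp  */

/* Simplified models of the real macros */
#define dSP       SV *sp = toy_stack_sp   /* declare the local copy     */
#define PUSHs(s)  (*++sp = (s))           /* push via the local copy    */
#define PUTBACK   (toy_stack_sp = sp)     /* publish local -> global    */
#define SPAGAIN   (sp = toy_stack_sp)     /* refresh local <- global    */

/* Like any function called from an XSUB, this helper only sees the
 * global pointer; it takes its own local copy, pushes, and publishes. */
static void push_two_via_global(void)
{
    dSP;
    PUSHs(100);
    PUSHs(200);
    PUTBACK;
}

/* Returns the number of items on the toy stack at the end; stores in
 * *visible_before_putback how many items the global pointer showed
 * after two local pushes but before the first PUTBACK. */
static int demo(int *visible_before_putback)
{
    dSP;
    PUSHs(1);
    PUSHs(2);
    *visible_before_putback = (int)(toy_stack_sp - stack_storage);

    PUTBACK;                  /* make the two pushes visible            */
    push_two_via_global();    /* pushes two more via the global pointer */
    SPAGAIN;                  /* without this, sp would be stale        */

    PUSHs(3);
    PUTBACK;
    return (int)(toy_stack_sp - stack_storage);
}
```

If C<SPAGAIN> were omitted in C<demo()>, the final C<PUSHs(3)> would overwrite the first value pushed by the helper, since the local pointer would still point at the slot holding C<2>.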
+ +The XS parser will usually emit two lines of C code similar to these +around the PP code block lines: + + SP -= items; + ... PP lines ... + PUTBACK; return; + +This has the effect of resetting the local copy of the stack pointer (but +I the stack pointer itself) back to the base of the current stack +frame, discarding any passed arguments. The original arguments are still +on the stack. C etc will, starting at the base of the stack +frame, progressively overwrite any original arguments. Finally, the +C sets the real stack pointer to the copy, making the changes +permanent, and also allowing the caller to determine how many arguments +were returned. + +Any functions called from the XSUB will only see the value of +C and not C. So when calling out to a function which +manipulates the stack, you may need to resynchronise the two; for example: + + PUTBACK; + push_contents_of_array(av); + SPAGAIN; + +The C and C macros will update both +C and C if the extending causes the stack to be +reallocated. + +Note that there are several C macros, which generally create a +temporary SV, set its value to the argument, and push it onto the stack. +These are: + + mPUSHs(sv) mortalise and push an SV + mPUSHi(iv) create+push mortal and set to the integer val + mPUSHu(uv) create+push mortal and set to the unsigned val + mPUSHn(n) create+push mortal and set to the num (float) val + mPUSHp(str, len) create+push mortal and set to the string+length + mPUSHpvs("string") create+push mortal and set to the literal string + (perl 5.38.0 onwards) + +=head3 The NOT_IMPLEMENTED_YET: Keyword + + void + foo(int a) + NOT_IMPLEMENTED_YET: + +This keyword, as a fourth alternative to C, C and autocall, +generates a main body for the XSUB consisting solely of the C code: + + Perl_croak(aTHX_ "Foo::Bar::foo: not implemented yet"); + +The current implementation is quite buggy in terms of parsing and where +the keyword can appear within an XSUB, so it's generally better to avoid +it. 
It is documented here for completeness. + +=head2 The XSUB Output Part + +Following an XSUB's code part, any results may be post-processed and +returned. Two keywords in particular support this: L, which allows for a block of code to be added after +any autocall in order to post-process return values from the call, and +L, which tells the parser to generate code to +return the value of C or to update the values of one or more +passed arguments. + +These two optional keywords should each only be used once at most, and in +that order; but due to a parsing bug (kept for backwards compatibility), +they can appear in either order any number of times. But don't do that. + +Note that the keywords described in L and L may also appear in this part. + +=head3 The POSTCALL: Keyword + +The C keyword allows a block of code to be inserted directly after +any autocall or C/C code block (although it's really only of +use with autocall). It's typically used for cleaning up the return value +from the autocall. For example these two XSUBs are equivalent: + + int + foo(int a) + POSTCALL: + if (RETVAL < 0) + RETVAL = 0; + + int + foo(int a) + CODE: + RETVAL = foo(a); + if (RETVAL < 0) + RETVAL = 0; OUTPUT: - RETVAL + RETVAL -Remember that you don't have to do this for an C. The reference -documentation for all core typemaps can be found in L. +=head3 The OUTPUT: Keyword -=head2 The MODULE Keyword + # Common usage: -The MODULE keyword is used to start the XS code and to specify the package -of the functions which are being defined. All text preceding the first -MODULE keyword is considered C code and is passed through to the output with -POD stripped, but otherwise untouched. Every XS module will have a -bootstrap function which is used to hook the XSUBs into Perl. The package -name of this bootstrap function will match the value of the last MODULE -statement in the XS source files. The value of MODULE should always remain -constant within the same XS file, though this is not required.
+ OUTPUT: + RETVAL -The following example will start the XS code and will place -all functions in a package named RPC. + # Rare usage: - MODULE = RPC + OUTPUT: + arg0 + SETMAGIC: DISABLE + arg1 + SETMAGIC: ENABLE + arg2 sv_setfoo(ST(2), arg2) -=head2 The PACKAGE Keyword -When functions within an XS source file must be separated into packages -the PACKAGE keyword should be used. This keyword is used with the MODULE -keyword and must follow immediately after it when used. +The C<OUTPUT> keyword can be used to indicate that the value of RETVAL +should be returned to the caller on the stack, and/or that the values of +certain passed Perl arguments should be updated with the current values of +the corresponding parameter variables. Each non-blank line of the +C<OUTPUT> block should contain the name of one variable, with optional +setting code, or a C<SETMAGIC> keyword with a value of C<ENABLE> or +C<DISABLE>. - MODULE = RPC PACKAGE = RPC +The common usage is to list just the C<RETVAL> variable: - [ XS code in package RPC ] + int + foo() + CODE: + RETVAL = ...; + OUTPUT: + RETVAL - MODULE = RPC PACKAGE = RPCB +It is needed for XSUBs containing a C<CODE> block to tell the XS compiler +to generate C code which will return the value of C<RETVAL> to the caller. +For autocall XSUBs, this is done automatically without the need for the +C<OUTPUT> keyword. - [ XS code in package RPCB ] +The second usage of C<OUTPUT> is to specify parameters to be updated; this +usage has been almost completely replaced by using the +L +parameter modifier. For example, these two XSUBs have identical behaviours, +but the second is the preferred form: - MODULE = RPC PACKAGE = RPC + int + foo1(a) + INPUT: + int &a + OUTPUT: + a - [ XS code in package RPC ] + int + foo2(IN_OUT int a) -The same package name can be used more than once, allowing for -non-contiguous code. This is useful if you have a stronger ordering -principle than package names.
+They both cause output C code similar to this to be planted (with the +first part derived from a typemap): -Although this keyword is optional and in some cases provides redundant -information it should always be used. This keyword will ensure that the -XSUBs appear in the desired package. + sv_setiv(ST(0), (IV)a); + SvSETMAGIC(ST(0)); -=head2 The PREFIX Keyword +which updates the value of the passed SV with the current value of C<a>, +and then calls the SV's I<set> magic, if any: which will, for example, +cause a tied variable to have its C<STORE> method called. -The PREFIX keyword designates prefixes which should be -removed from the Perl function names. If the C function is -C and the PREFIX value is C then Perl will -see this function as C. +You can skip the planting of the C<SvSETMAGIC> magic call with +C<SETMAGIC: DISABLE>; in the example at the start of this section, C<arg0> +and C<arg2> will have set magic, while C<arg1> won't. The C<SETMAGIC> +setting remains in force until another C<SETMAGIC>, or notionally until +the end of the current C<OUTPUT> block. In fact the current setting will +carry over into any further C<OUTPUT> declarations within the same +XSUB, or since Perl 5.40.0, only into any declarations within the same +C<CASE> branch. -This keyword should follow the PACKAGE keyword when used. -If PACKAGE is not used then PREFIX should follow the MODULE -keyword. +The current setting of C<SETMAGIC> is ignored for C<RETVAL>, which is +usually setting the value of a fresh temporary SV which won't have any +attached magic anyway. - MODULE = RPC PREFIX = rpc_ +Finally, it is possible to override the typemap entry used to set the +value of the temporary SV or passed argument from the C<RETVAL> or other +variables. Normally, in an XSUB like: - MODULE = RPC PACKAGE = RPCB PREFIX = rpcb_ + int + foo(int abc) + OUTPUT: + abc -=head2 The OUTPUT: Keyword -The OUTPUT: keyword indicates that certain function parameters should be -updated (new values made visible to Perl) when the XSUB terminates or that -certain values should be returned to the calling Perl function.
For -simple functions which have no CODE: or PPCODE: section, -such as the sin() function above, the RETVAL variable is -automatically designated as an output value. For more complex functions -the B compiler will need help to determine which variables are output -variables. +the C type (via a two-stage lookup in the system typemap) will yield +this output typemap entry: -This keyword will normally be used to complement the CODE: keyword. -The RETVAL variable is not recognized as an output variable when the -CODE: keyword is present. The OUTPUT: keyword is used in this -situation to tell the compiler that RETVAL really is an output -variable. + sv_setiv($arg, (IV)$var); -The OUTPUT: keyword can also be used to indicate that function parameters -are output variables. This may be necessary when a parameter has been -modified within the function and the programmer would like the update to -be seen by Perl. +which, after variable expansion, may yield - bool_t - rpcb_gettime(host,timep) - char *host - time_t &timep - OUTPUT: - timep + sv_setiv(ST(0), (IV)abc); -The OUTPUT: keyword will also allow an output parameter to -be mapped to a matching piece of code rather than to a -typemap. +or similar. This can be overridden; for example - bool_t - rpcb_gettime(host,timep) - char *host - time_t &timep - OUTPUT: - timep sv_setnv(ST(1), (double)timep); - -B emits an automatic C for all parameters in the -OUTPUT section of the XSUB, except RETVAL. This is the usually desired -behavior, as it takes care of properly invoking 'set' magic on output -parameters (needed for hash or array element parameters that must be -created if they didn't exist). If for some reason, this behavior is -not desired, the OUTPUT section may contain a C line -to disable it for the remainder of the parameters in the OUTPUT section. -Likewise, C can be used to reenable it for the -remainder of the OUTPUT section. See L for more details -about 'set' magic. 
- -=head2 The NO_OUTPUT Keyword - -The NO_OUTPUT can be placed as the first token of the XSUB. This keyword -indicates that while the C subroutine we provide an interface to has -a non-C return type, the return value of this C subroutine should not -be returned from the generated Perl subroutine. - -With this keyword present L is created, and in the -generated call to the subroutine this variable is assigned to, but the value -of this variable is not going to be used in the auto-generated code. - -This keyword makes sense only if C is going to be accessed by the -user-supplied code. It is especially useful to make a function interface -more Perl-like, especially when the C return value is just an error condition -indicator. For example, - - NO_OUTPUT int - delete_file(char *name) - POSTCALL: - if (RETVAL != 0) - croak("Error %d while deleting file '%s'", RETVAL, name); - -Here the generated XS function returns nothing on success, and will die() -with a meaningful error message on error. - -=head2 The CODE: Keyword - -This keyword is used in more complicated XSUBs which require -special handling for the C function. The RETVAL variable is -still declared, but it will not be returned unless it is specified -in the OUTPUT: section. - -The following XSUB is for a C function which requires special handling of -its parameters. The Perl usage is given first. - - $status = rpcb_gettime( "localhost", $timep ); - -The XSUB follows. - - bool_t - rpcb_gettime(host,timep) - char *host - time_t timep - CODE: - RETVAL = rpcb_gettime( host, &timep ); - OUTPUT: - timep - RETVAL + int + foo(int abc) + OUTPUT: + abc my_setiv(ST(0), (IV)abc); -=head2 The INIT: Keyword +But importantly, unlike the similar syntax in C<INPUT> lines, the override +text is I<not> variable expanded. It is thus tricky to ensure that the +right arguments are used (such as C<ST(0)>). Basically this feature has a +design flaw and should probably be avoided.
Since 5.16.0 it's been +possible to have locally defined typemaps using the L keyword which is probably a better way to modify how +values are returned. -The INIT: keyword allows initialization to be inserted into the XSUB before -the compiler generates the call to the C function. Unlike the CODE: keyword -above, this keyword does not affect the way the compiler handles RETVAL. +=head2 The XSUB Cleanup Part - bool_t - rpcb_gettime(host,timep) - char *host - time_t &timep - INIT: - printf("# Host is %s\n", host ); - OUTPUT: - timep +Following an XSUB's output part, where code will have been planted to +return the value of C and C/C parameters, it's +possible to inject some final clean-up code by using the C +keyword. -Another use for the INIT: section is to check for preconditions before -making a call to the C function: +Note that the keywords described in L and L may also appear in this part. - long long - lldiv(a,b) - long long a - long long b - INIT: - if (a == 0 && b == 0) - XSRETURN_UNDEF; - if (b == 0) - croak("lldiv: cannot divide by 0"); - -=head2 The NO_INIT Keyword - -The NO_INIT keyword is used to indicate that a function -parameter is being used only as an output value. The B -compiler will normally generate code to read the values of -all function parameters from the argument stack and assign -them to C variables upon entry to the function. NO_INIT -will tell the compiler that some parameters will be used for -output rather than for input and that they will be handled -before the function terminates. - -The following example shows a variation of the rpcb_gettime() function. -This function uses the timep variable only as an output variable and does -not care about its initial contents. - - bool_t - rpcb_gettime(host,timep) - char *host - time_t &timep = NO_INIT - OUTPUT: - timep - -=head2 The TYPEMAP: Keyword - -Starting with Perl 5.16, you can embed typemaps into your XS code -instead of or in addition to typemaps in a separate file. 
Multiple -such embedded typemaps will be processed in order of appearance in -the XS code and like local typemap files take precedence over the -default typemap, the embedded typemaps may overwrite previous -definitions of TYPEMAP, INPUT, and OUTPUT stanzas. The syntax for -embedded typemaps is - - TYPEMAP: < keyword must appear in the first column of a -new line. - -Refer to L for details on writing typemaps. - -=head2 Initializing Function Parameters - -C function parameters are normally initialized with their values from -the argument stack (which in turn contains the parameters that were -passed to the XSUB from Perl). The typemaps contain the -code segments which are used to translate the Perl values to -the C parameters. The programmer, however, is allowed to -override the typemaps and supply alternate (or additional) -initialization code. Initialization code starts with the first -C<=>, C<;> or C<+> on a line in the INPUT: section. The only -exception happens if this C<;> terminates the line, then this C<;> -is quietly ignored. - -The following code demonstrates how to supply initialization code for -function parameters. The initialization code is eval'ed within double -quotes by the compiler before it is added to the output so anything -which should be interpreted literally [mainly C<$>, C<@>, or C<\\>] -must be protected with backslashes. The variables C<$var>, C<$arg>, -and C<$type> can be used as in typemaps. - - bool_t - rpcb_gettime(host,timep) - char *host = (char *)SvPVbyte_nolen($arg); - time_t &timep = 0; - OUTPUT: - timep - -This should not be used to supply default values for parameters. One -would normally use this when a function parameter must be processed by -another library function before it can be used. Default parameters are -covered in the next section. - -If the initialization begins with C<=>, then it is output in -the declaration for the input variable, replacing the initialization -supplied by the typemap. 
If the initialization -begins with C<;> or C<+>, then it is performed after -all of the input variables have been declared. In the C<;> -case the initialization normally supplied by the typemap is not performed. -For the C<+> case, the declaration for the variable will include the -initialization from the typemap. A global -variable, C<%v>, is available for the truly rare case where -information from one initialization is needed in another -initialization. - -Here's a truly obscure example: - - bool_t - rpcb_gettime(host,timep) - time_t &timep; /* \$v{timep}=@{[$v{timep}=$arg]} */ - char *host + SvOK($v{timep}) ? SvPVbyte_nolen($arg) : NULL; - OUTPUT: - timep +=head3 The CLEANUP: Keyword -The construct C<\$v{timep}=@{[$v{timep}=$arg]}> used in the above -example has a two-fold purpose: first, when this line is processed by -B, the Perl snippet C<$v{timep}=$arg> is evaluated. Second, -the text of the evaluated snippet is output into the generated C file -(inside a C comment)! During the processing of C line, -C<$arg> will evaluate to C, and C<$v{timep}> will evaluate to -C. + char * + foo(int a) + CODE: + RETVAL = get_foo(a); + OUTPUT: + RETVAL + CLEANUP: + free(RETVAL); /* assuming get_foo() returns a malloced buffer */ + +The C keyword allows a block of code to be inserted directly +after any output code which has been generated automatically or via the +C keyword. It can be used when an XSUB requires special clean-up +procedures before it terminates. The code specified for the clean-up block +will be added as the last statements in the XSUB before the final +C or similar. + +=head2 XSUB Generic Keywords + +There are a few per-XSUB keywords which can appear anywhere within the +body of an XSUB. This is because they affect how the XSUB is registered +with the Perl interpreter, rather than affecting how the C code of the +XSUB itself is generated. These are described in the following +subsections. 
In addition there are a few more generic keywords which are +described later under L. + +On aesthetic grounds, it is best to use these keywords near the start of +the XSUB. -=head2 Default Parameter Values +=head3 The PROTOTYPE: Keyword -Default values for XSUB arguments can be specified by placing an -assignment statement in the parameter list. The default value may -be a number, a string or the special string C. Defaults should -always be used on the right-most parameters only. + int + foo1(int a, int b = 0) + # this XSUB gets an auto-generated '$;$' prototype + PROTOTYPE: ENABLE -To allow the XSUB for rpcb_gettime() to have a default host -value the parameters to the XSUB could be rearranged. The -XSUB will then call the real rpcb_gettime() function with -the parameters in the correct order. This XSUB can be called -from Perl with either of the following statements: + int + foo2(int a, int b) + # this XSUB doesn't get a prototype + PROTOTYPE: DISABLE - $status = rpcb_gettime( $timep, $host ); + int + foo3(SV* a, int b) + # this XSUB gets the specified prototype: + PROTOTYPE: \@$ - $status = rpcb_gettime( $timep ); + int + foo4(int a, int b) + # this XSUB gets a blank () prototype + PROTOTYPE: -The XSUB will look like the code which follows. A CODE: -block is used to call the real rpcb_gettime() function with -the parameters in the correct order for that function. +While the file-scoped C keyword turns automatic prototype +generation on or off for all subsequent XSUBs, the per-XSUB C +keyword overrides the setting for just the current XSUB. See the +L section for details of what a +prototype is, and why you rarely need one. 
- bool_t - rpcb_gettime(timep,host="localhost") - char *host - time_t timep = NO_INIT - CODE: - RETVAL = rpcb_gettime( host, &timep ); - OUTPUT: - timep - RETVAL - -=head2 The PREINIT: Keyword - -The PREINIT: keyword allows extra variables to be declared immediately -before or after the declarations of the parameters from the INPUT: section -are emitted. - -If a variable is declared inside a CODE: section it will follow any typemap -code that is emitted for the input parameters. This may result in the -declaration ending up after C code, which is C syntax error. Similar -errors may happen with an explicit C<;>-type or C<+>-type initialization of -parameters is used (see L<"Initializing Function Parameters">). Declaring -these variables in an INIT: section will not help. - -In such cases, to force an additional variable to be declared together -with declarations of other variables, place the declaration into a -PREINIT: section. The PREINIT: keyword may be used one or more times -within an XSUB. - -The following examples are equivalent, but if the code is using complex -typemaps then the first example is safer. - - bool_t - rpcb_gettime(timep) - time_t timep = NO_INIT - PREINIT: - char *host = "localhost"; - CODE: - RETVAL = rpcb_gettime( host, &timep ); - OUTPUT: - timep - RETVAL - -For this particular case an INIT: keyword would generate the -same C code as the PREINIT: keyword. 
Another correct, but error-prone example: - - bool_t - rpcb_gettime(timep) - time_t timep = NO_INIT - CODE: - char *host = "localhost"; - RETVAL = rpcb_gettime( host, &timep ); - OUTPUT: - timep - RETVAL - -Another way to declare C is to use a C block in the CODE: section: - - bool_t - rpcb_gettime(timep) - time_t timep = NO_INIT - CODE: - { - char *host = "localhost"; - RETVAL = rpcb_gettime( host, &timep ); - } - OUTPUT: - timep - RETVAL - -The ability to put additional declarations before the typemap entries are -processed is very handy in the cases when typemap conversions manipulate -some global state: - - MyObject - mutate(o) - PREINIT: - MyState st = global_state; - INPUT: - MyObject o; - CLEANUP: - reset_to(global_state, st); - -Here we suppose that conversion to C in the INPUT: section and from -MyObject when processing RETVAL will modify a global variable C. -After these conversions are performed, we restore the old value of -C (to avoid memory leaks, for example). - -There is another way to trade clarity for compactness: INPUT sections allow -declaration of C variables which do not appear in the parameter list of -a subroutine. Thus the above code for mutate() can be rewritten as - - MyObject - mutate(o) - MyState st = global_state; - MyObject o; - CLEANUP: - reset_to(global_state, st); - -and the code for rpcb_gettime() can be rewritten as - - bool_t - rpcb_gettime(timep) - time_t timep = NO_INIT - char *host = "localhost"; - C_ARGS: - host, &timep - OUTPUT: - timep - RETVAL - -=head2 The SCOPE: Keyword - -The SCOPE: keyword allows scoping to be enabled for a particular XSUB. -Its effect is to wrap the main body of the XSUB (i.e. the C or -C or implicit) with an C and C pair. This has the -effect of clearing any accumulated savestack entries at the end of the -code body. It is disabled by default. 
- -The SCOPE keyword may appear either within the XSUB body (anywhere before -a C could appear), or just before the XSUB declaration, but part of -the same paragraph (i.e. no intervening blank lines). For example: +This keyword's value can be either one of C/C to turn on +or off automatic prototype generation, or it can specify an explicit +prototype string, including the empty prototype. - void - foo() - INPUT: - ... - PREINIT: - ... - SCOPE: ENABLE - CODE: - ... +=head3 The OVERLOAD: Keyword + MODULE = Foo PACKAGE = Foo::Bar - SCOPE: ENABLE - void - bar() + SV* + subtract(SV* a, SV* b, bool swap) + OVERLOAD: - -= + CODE: ... -The first form (within the XSUB body) has been available since perl-5.004, -but was broken by perl-5.12.0 (xsubpp v2.21) and fixed in perl-5.44.0 -(xsubpp v3.58). The second form has been available since perl-5.12.0 . - -Note that to support potentially complex type mappings, if a typemap entry -used by an XSUB contains a comment like C, then scoping will -be automatically enabled for any XSUB which uses that typemap entry for an -C parameter. This currently only works for parameters whose type -is specified in a separate C line rather than any ANSI-style -declaration (C). - -=head2 The INPUT: Keyword - -The XSUB's parameters are usually evaluated immediately after entering the -XSUB. The INPUT: keyword can be used to force those parameters to be -evaluated a little later. The INPUT: keyword can be used multiple times -within an XSUB and can be used to list one or more input variables. This -keyword is used with the PREINIT: keyword. - -The following example shows how the input parameter C can be -evaluated late, after a PREINIT. - - bool_t - rpcb_gettime(host,timep) - char *host - PREINIT: - time_t tt; - INPUT: - time_t timep - CODE: - RETVAL = rpcb_gettime( host, &tt ); - timep = tt; - OUTPUT: - timep - RETVAL - -The next example shows each input parameter evaluated late. 
- - bool_t - rpcb_gettime(host,timep) - PREINIT: - time_t tt; - INPUT: - char *host - PREINIT: - char *h; - INPUT: - time_t timep - CODE: - h = host; - RETVAL = rpcb_gettime( h, &tt ); - timep = tt; - OUTPUT: - timep - RETVAL - -Since INPUT sections allow declaration of C variables which do not appear -in the parameter list of a subroutine, this may be shortened to: - - bool_t - rpcb_gettime(host,timep) - time_t tt; - char *host; - char *h = host; - time_t timep; - CODE: - RETVAL = rpcb_gettime( h, &tt ); - timep = tt; - OUTPUT: - timep - RETVAL +The C keyword allows you to declare that this XSUB acts as an +overload method for the specified operators in the current package. The +example above is approximately equivalent to this Perl code: + + package Foo::Bar; -(We used our knowledge that input conversion for C is a "simple" one, -thus C is initialized on the declaration line, and our assignment -C is not performed too early. Otherwise one would need to have the -assignment C in a CODE: or INIT: section.) + sub subtract { ... } -=head2 The IN/OUTLIST/IN_OUTLIST/OUT/IN_OUT Keywords + use overload + '-' => \&subtract, + '-=' => \&subtract; -In the list of parameters for an XSUB, one can precede parameter names -by the C/C/C/C/C keywords. -C keyword is the default, the other keywords indicate how the Perl -interface should differ from the C interface. +The rest of the line following the keyword, plus any further lines until +the next keyword, are interpreted as a space-separated list of overloaded +operators. There is no check that they are valid operator names. The names +and symbols will eventually end up within double-quoted strings +in the C file, so double-quotes need to be escaped; in particular: -Parameters preceded by C/C/C/C -keywords are considered to be used by the C subroutine I. C/C keywords indicate that the C subroutine -does not inspect the memory pointed by this parameter, but will write -through this pointer to provide additional return values. 
+ OVERLOAD: \"\" -Parameters preceded by C keyword do not appear in the usage -signature of the generated Perl function. +This could be regarded as a bug. -Parameters preceded by C/C/C I appear as -parameters to the Perl function. With the exception of -C-parameters, these parameters are converted to the corresponding -C type, then pointers to these data are given as arguments to the C -function. It is expected that the C function will write through these -pointers. +XSUBs used for overload methods are invoked with the same arguments as +Perl subroutines would be: for example, an overloaded binary operator will +trigger a call to the XSUB method with the first argument being an +overloaded object representing one of the two operands of the binary +operator; the second being the other operand (which may or may not be an +object); and third, a swap flag. See L for the full details +of how these functions will be called, with what arguments. Note that +C can in fact be undef in addition to false, to indicate an assign +overload such as C<+=>. -The return list of the generated Perl function consists of the C return value -from the function (unless the XSUB is of C return type or -C was used) followed by all the C -and C parameters (in the order of appearance). On the -return from the XSUB the C/C Perl parameter will be -modified to have the values written by the C function. +Bitwise operator methods sometimes take extra arguments: in +particular under C. So you may want to use an +ellipsis (something like C<(lobj, robj, swap, ...)>) to skip them. -For example, an XSUB +The net effect of the C keyword is to add some extra code to +the boot XSUB to register this XSUB as the handler for the specified +overload actions, in the same way that C does for Perl +methods. - void - day_month(OUTLIST day, IN unix_time, OUTLIST month) - int day - int unix_time - int month +See also the file-scoped L keyword for +details of how to set the fallback behaviour for the current package. 
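The swap flag described earlier in this section can be modelled in plain C. What follows is only a minimal sketch under stated assumptions: it uses int values in place of SVs, the function names model_subtract and model_add are hypothetical, and it ignores the undef case used for assignment variants such as -=. It is not generated XS output and not part of any Perl API.

```c
#include <stdbool.h>

/* Plain-C model of an overloaded binary '-' handler (hypothetical name).
 * 'self' stands for the value held by the overloaded object, 'other' for
 * the remaining operand.  'swapped' is true when the object was
 * originally the right-hand operand of the operator. */
int
model_subtract(int self, int other, bool swapped)
{
    /* Subtraction is not commutative, so the operand order must be
     * restored before computing the result. */
    return swapped ? other - self : self - other;
}

/* Addition is commutative, so the flag can safely be ignored. */
int
model_add(int self, int other, bool swapped)
{
    (void)swapped;
    return self + other;
}
```

With these definitions, model_subtract(13, 7, false) gives 6, whereas model_subtract(13, 7, true) gives -6, which is why the non-commutative operators need to look at the flag.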
-should be used from Perl as +Note that C shouldn't be mixed with the L keyword; the value of C will be undefined for any overload +method call. - my ($day, $month) = day_month(time); +The L section contains a fully-worked +example of using the C typemap to wrap a simple arithmetic +library. The result of that wrapper allows you to write Perl code such as: -The C signature of the corresponding function should be + my $i2 = My::Num->new(2); + my $i7 = My::Num->new(7); + my $i13 = My::Num->new(13); - void day_month(int *day, int unix_time, int *month); + my $x = $i13->add($i7)->divide($i2); + printf "val=%d\n", $x->val(); -The C/C/C/C/C keywords can be -mixed with ANSI-style declarations, as in +Using overloading, we would like to be able to write those last two lines +more simply as: - void - day_month(OUTLIST int day, int unix_time, OUTLIST int month) + my $x = ($i13 + $i7)/$i2; + printf "val=%d\n", $x; -(here the optional C keyword is omitted). +The following additions and modifications to that example XS code show how +to add overloading: -The C parameters are identical with parameters introduced with -L and put into the C section (see -L). The C parameters are very similar, -the only difference being that the value C function writes through the -pointer would not modify the Perl parameter, but is put in the output -list. + FALLBACK: UNDEF -The C/C parameter differ from C/C -parameters only by the initial value of the Perl parameter not -being read (and not being given to the C function - which gets some -garbage instead). For example, the same C function as above can be -interfaced with as + int + mynum_val(My::Num x, ...) + OVERLOAD: 0+ - void day_month(OUT int day, int unix_time, OUT int month); + My::Num + mynum_add(My::Num x, My::Num y, bool swap) + OVERLOAD: + + C_ARGS: x, y + INIT: + if (swap) { + mynum* tmp = x; x = y; y = tmp; + } -or + # ... and three similar XSUBs for + # mynum_subtract, mynum_multiply, mynum_divide ... 
+ +The C<FALLBACK> line isn't actually necessary as this is the default +anyway, but is included to remind you that the keyword can be used. + +Overloading is added to the C<val> method so that it automatically +returns the value of an object when used in a numeric context (such as for +the C<printf> above). The ellipsis is added to ignore the extra two +arguments passed to an overload method. + +The original C method which, via aliasing, handled all four +of the arithmetic operations, is now split into four separate XSUBs, since +C<OVERLOAD> and C<ALIAS> don't mix. + +The main change to each arithmetic XSUB, apart from adding the C<OVERLOAD> +keyword, is that there is an extra C<swap> parameter. There's no real need +to use it for addition and multiplication, but it is important for the +non-commutative subtraction and division operations. + +That example uses the C<T_PTROBJ> typemap to process the second argument, +which in the most general usage may not be an object. For example the +second and third of these lines will croak with an C error: + + $i13 + My::Num->new(7); + $i13 + 7; + $i13 + "7"; + +If it is necessary to handle this, then you may need to create your own +typemap: for example, something similar to C<T_PTROBJ>, but with an INPUT +template along the lines of: + + T_MYNUM + SV *sv = $arg; + SvGETMAGIC(sv); + if (!SvROK(sv)) { + sv = sv_newmortal(); + sv_setref_pv(sv, "$ntype", mynum_new(SvIV($arg))); + } + ....
- void - day_month(day, unix_time, month) - int &day = NO_INIT - int unix_time - int &month = NO_INIT - OUTPUT: - day - month +Finally, although not directly related to XS, the following could be added +to F to allow integer literals to be used directly: -However, the generated Perl function is called in very C-ish style: + sub import { + overload::constant integer => + sub { + my $str = shift; + return My::Num->new($str); + }; + } - my ($day, $month); - day_month($day, time, $month); +which then allows these lines: -=head2 The C Keyword + my $i2 = My::Num->new(2); + my $i7 = My::Num->new(7); + my $i13 = My::Num->new(13); -If one of the input arguments to the C function is the length of a string -argument C, one can substitute the name of the length-argument by -C in the XSUB declaration. This argument must be omitted when -the generated Perl function is called. E.g., +to be rewritten more cleanly as: - void - dump_chars(char *s, short l) - { - short n = 0; - while (n < l) { - printf("s[%d] = \"\\%#03o\"\n", n, (int)s[n]); - n++; - } - } + my $i2 = 2; + my $i7 = 7; + my $i13 = 13; - MODULE = x PACKAGE = x +=head3 The ATTRS: Keyword - void dump_chars(char *s, short length(s)) + MODULE = Foo::Bar PACKAGE = Foo::Bar -should be called as C. + SV* + debug() + ATTRS: lvalue + PPCODE: + # return $Foo::Bar::DEBUG, creating it if not already present: + PUSHs(GvSV(gv_fetchpvs("Foo::Bar::DEBUG", GV_ADD, SVt_IV))); -This directive is supported with ANSI-type function declarations only. +The C keyword allows you to apply subroutine attributes to an XSUB +in a similar fashion to Perl subroutines. The XSUB in the example above is +equivalent to this Perl: -=head2 Variable-length Parameter Lists + sub debug :lvalue { return $Foo::Bar::DEBUG } -XSUBs can have variable-length parameter lists by specifying an ellipsis -C<(...)> in the parameter list. This use of the ellipsis is similar to that -found in ANSI C. 
The programmer is able to determine the number of -arguments passed to the XSUB by examining the C variable which the -B compiler supplies for all XSUBs. By using this mechanism one can -create an XSUB which accepts a list of parameters of unknown length. +and both can be called like this: -The I parameter for the rpcb_gettime() XSUB can be -optional so the ellipsis can be used to indicate that the -XSUB will take a variable number of parameters. Perl should -be able to call this XSUB with either of the following statements. + use Foo::Bar; + Foo::Bar::debug() = 99; + print "$Foo::Bar::DEBUG\n"; # prints 99 - $status = rpcb_gettime( $timep, $host ); +This keyword consumes all lines until the next keyword. The contents of +each line are interpreted as space-separated attributes. The attributes +are applied at the time the XS module is loaded. This: - $status = rpcb_gettime( $timep ); + void + foo(...) + ATTRS: aaa + bbb(x,y) ccc + +is approximately equivalent to: + + use attributes Foo::Bar, \&foo, 'aaa'; + use attributes Foo::Bar, \&foo, 'bbb(x,y)'; + use attributes Foo::Bar, \&foo, 'ccc'; + +User-defined attributes, just like with Perl subs, will trigger a call to +C, as described in L. + +Note that not all built-in subroutine attributes necessarily make sense +applied to XSUBs. + +Currently the parsing of white-space is crude: C is +misinterpreted as two separate attributes, C<'bbb(x,'> and C<'y)'>. + +The C keyword can't currently be used in conjunction with C +or C; in this case, the attributes are just silently ignored. + +=head2 Sharing XSUB bodies + +Sometimes you want to write several XSUBs which are very similar: they +all have the same signature, have the same generated code to convert +arguments and return values between Perl and C, and may only differ in a +few lines in the main body or in which C library function they wrap. It is +in fact possible to share the same XSUB function among multiple Perl CVs. 
+For example, C<&Foo::Bar::add> and C<&Foo::Bar::subtract> could be two +separate CVs in the Perl namespace which both point to the same XSUB, +C say. But each CV holds some sort of unique +identifier which can be accessed by the XSUB so that it can determine +whether it should behave as C or C. + +Both the C and C keywords (described below) allow +multiple CVs to share the same XSUB. The difference between them is that +C is intended for when you supply the main body of the XSUB +yourself (e.g. using C): it sets an integer variable, C (derived +from the passed CV), which you can use in a C statement or +similar. Conversely, C is intended for use with autocall; +information stored in the CV indicates which C library function should be +autocalled. + +Finally, there is the C keyword, which allows the whole body of an +XSUB (not just the C part) to have alternate cases. It can be +thought of as a C analogue which works at the top-most XS level +rather than at the C level. The value the C acts on could be +C for example, or it could be used in conjunction with the C +keyword and switch on the value of C. + +=head3 The ALIAS: Keyword + + int add(int x, int y) + ALIAS: + # implicit: add = 0 + subtract = 1 + multiply = 2 divide = 3 + CODE: + switch (ix) { ... } -The XS code, with ellipsis, follows. +Note that this keyword can appear anywhere within the body of an XSUB. - bool_t - rpcb_gettime(timep, ...) - time_t timep = NO_INIT - PREINIT: - char *host = "localhost"; - CODE: - if( items > 1 ) - host = (char *)SvPVbyte_nolen(ST(1)); - RETVAL = rpcb_gettime( host, &timep ); - OUTPUT: - timep - RETVAL - -=head2 The C_ARGS: Keyword - -The C_ARGS: keyword allows creating of XSUBS which have different -calling sequence from Perl than from C, without a need to write -CODE: or PPCODE: section. The contents of the C_ARGS: paragraph is -put as the argument to the called C function without any change. 
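The elided switch (ix) body in the ALIAS example above might look something like the following plain-C model. This is a sketch under stated assumptions rather than generated XS code: the function name model_arith is hypothetical, ix is passed as an ordinary parameter instead of being derived from the CV, and returning 0 on division by zero is an arbitrary choice (a real XSUB would more likely croak).

```c
#include <assert.h>

/* Plain-C model of one body shared by four aliases (hypothetical name).
 * 'ix' plays the role of the alias index from the example:
 * 0 = add (the main name), 1 = subtract, 2 = multiply, 3 = divide. */
int
model_arith(int ix, int x, int y)
{
    assert(ix >= 0 && ix <= 3);   /* only four names are registered */

    switch (ix) {
    case 1:  return x - y;
    case 2:  return x * y;
    case 3:  return y != 0 ? x / y : 0;  /* assumption: 0 on divide by zero */
    default: return x + y;               /* ix == 0: called as add() */
    }
}
```

For instance, model_arith(1, 6, 3) gives 3 and model_arith(3, 6, 3) gives 2: the same function body serves all four names, with only the index differing.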
- -For example, suppose that a C function is declared as - - symbolic nth_derivative(int n, symbolic function, int flags); - -and that the default flags are kept in a global C variable -C. Suppose that you want to create an interface which -is called as - - $second_deriv = $function->nth_derivative(2); - -To do this, declare the XSUB as - - symbolic - nth_derivative(function, n) - symbolic function - int n - C_ARGS: - n, function, default_flags - -=head2 The PPCODE: Keyword - -The PPCODE: keyword is an alternate form of the CODE: keyword and is used -to tell the B compiler that the programmer is supplying the code to -control the argument stack for the XSUBs return values. Occasionally one -will want an XSUB to return a list of values rather than a single value. -In these cases one must use PPCODE: and then explicitly push the list of -values on the stack. The PPCODE: and CODE: keywords should not be used -together within the same XSUB. - -The actual difference between PPCODE: and CODE: sections is in the -initialization of C macro (which stands for the I Perl -stack pointer), and in the handling of data on the stack when returning -from an XSUB. In CODE: sections SP preserves the value which was on -entry to the XSUB: SP is on the function pointer (which follows the -last parameter). In PPCODE: sections SP is moved backward to the -beginning of the parameter list, which allows C macros -to place output values in the place Perl expects them to be when -the XSUB returns back to Perl. - -The generated trailer for a CODE: section ensures that the number of return -values Perl will see is either 0 or 1 (depending on the Cness of the -return value of the C function, and heuristics mentioned in -L<"The RETVAL Variable">). The trailer generated for a PPCODE: section -is based on the number of return values and on the number of times -C was updated by C<[X]PUSH*()> macros. - -Note that macros C, C and C work equally -well in CODE: sections and PPCODE: sections. 
- -The following XSUB will call the C rpcb_gettime() function -and will return its two output values, timep and status, to -Perl as a single list. - - void - rpcb_gettime(host) - char *host - PREINIT: - time_t timep; - bool_t status; - PPCODE: - status = rpcb_gettime( host, &timep ); - EXTEND(SP, 2); - PUSHs(sv_2mortal(newSViv(status))); - PUSHs(sv_2mortal(newSViv(timep))); - -Notice that the programmer must supply the C code necessary -to have the real rpcb_gettime() function called and to have -the return values properly placed on the argument stack. - -The C return type for this function tells the B compiler that -the RETVAL variable is not needed or used and that it should not be created. -In most scenarios the void return type should be used with the PPCODE: -directive. - -The EXTEND() macro is used to make room on the argument -stack for 2 return values. The PPCODE: directive causes the -B compiler to create a stack pointer available as C, and it -is this pointer which is being used in the EXTEND() macro. -The values are then pushed onto the stack with the PUSHs() -macro. - -Now the rpcb_gettime() function can be used from Perl with -the following statement. - - ($status, $timep) = rpcb_gettime("localhost"); - -When handling output parameters with a PPCODE section, be sure to handle -'set' magic properly. See L for details about 'set' magic. - -=head2 Returning Undef And Empty Lists - -Occasionally the programmer will want to return simply -C or an empty list if a function fails rather than a -separate status value. The rpcb_gettime() function offers -just this situation. If the function succeeds we would like -to have it return the time and if it fails we would like to -have undef returned. In the following Perl code the value -of $timep will either be undef or it will be a valid time. 
- - $timep = rpcb_gettime( "localhost" ); - -The following XSUB uses the C return type as a mnemonic only, -and uses a CODE: block to indicate to the compiler -that the programmer has supplied all the necessary code. The -sv_newmortal() call will initialize the return value to undef, making that -the default return value. - - SV * - rpcb_gettime(host) - char * host - PREINIT: - time_t timep; - bool_t x; - CODE: - ST(0) = sv_newmortal(); - if( rpcb_gettime( host, &timep ) ) - sv_setnv( ST(0), (double)timep); - -The next example demonstrates how one would place an explicit undef in the -return value, should the need arise. - - SV * - rpcb_gettime(host) - char * host - PREINIT: - time_t timep; - bool_t x; - CODE: - if( rpcb_gettime( host, &timep ) ){ - ST(0) = sv_newmortal(); - sv_setnv( ST(0), (double)timep); - } - else{ - ST(0) = &PL_sv_undef; - } - -To return an empty list one must use a PPCODE: block and -then not push return values on the stack. - - void - rpcb_gettime(host) - char *host - PREINIT: - time_t timep; - PPCODE: - if( rpcb_gettime( host, &timep ) ) - PUSHs(sv_2mortal(newSViv(timep))); - else{ - /* Nothing pushed on stack, so an empty - * list is implicitly returned. */ - } - -Some people may be inclined to include an explicit C in the above -XSUB, rather than letting control fall through to the end. In those -situations C should be used, instead. This will ensure that -the XSUB stack is properly adjusted. Consult L for other -C macros. - -Since C macros can be used with CODE blocks as well, one can -rewrite this example as: - - int - rpcb_gettime(host) - char *host - PREINIT: - time_t timep; - CODE: - RETVAL = rpcb_gettime( host, &timep ); - if (RETVAL == 0) - XSRETURN_UNDEF; - OUTPUT: - RETVAL - -In fact, one can put this check into a POSTCALL: section as well. 
Together -with PREINIT: simplifications, this leads to: - - int - rpcb_gettime(host) - char *host - time_t timep; - POSTCALL: - if (RETVAL == 0) - XSRETURN_UNDEF; - -=head2 The REQUIRE: Keyword - -The REQUIRE: keyword is used to indicate the minimum version of the -B compiler needed to compile the XS module. An XS module which -contains the following statement will compile with only B version -1.922 or greater: - - REQUIRE: 1.922 - -=head2 The CLEANUP: Keyword - -This keyword can be used when an XSUB requires special cleanup procedures -before it terminates. When the CLEANUP: keyword is used it must follow -any CODE:, or OUTPUT: blocks which are present in the XSUB. The code -specified for the cleanup block will be added as the last statements in -the XSUB. +The C keyword allows a single XSUB to have two or more Perl names +and to know which of those names was used when it was invoked. Each alias +is given an integer index value, with the main name of the XSUB being +index 0. This index is accessible via the variable C which is +initialised based on which CV (i.e. which Perl subroutine) was called. -=head2 The POSTCALL: Keyword +Note that an XSUB may be shared by multiple CVs, and each CV may have +multiple names. Given the C XSUB definition above, and given this +Perl code: -This keyword can be used when an XSUB requires special procedures -executed after the C subroutine call is performed. When the POSTCALL: -keyword is used it must precede OUTPUT: and CLEANUP: blocks which are -present in the XSUB. + use Foo::Bar; + BEGIN { *addition = *add } -See examples in L<"The NO_OUTPUT Keyword"> and L<"Returning Undef And Empty Lists">. +Then in the C namespace, the entries C and C +point to the same CV, which has index 0 stored in it; while C +points to a second CV with index 1, and so on. All four CVs point to the +same C function, C. 
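The sharing arrangement described above can be modelled in plain C, outside of perl. The following is only a minimal sketch and is not what the XS parser generates: the cv_model struct, the name_table array and the shared_arith and call_by_name functions are hypothetical names invented for this illustration, with index values taken from the ALIAS example above. Each modelled "CV" carries its own index, two names may share one index, and every entry dispatches through a single function which switches on that index.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a Perl CV: a name plus the index stored in it. */
typedef struct {
    const char *name;
    int ix;
} cv_model;

/* Index values follow the ALIAS example: add = 0 (implicit), subtract = 1,
 * multiply = 2, divide = 3. "addition" models a second name for the add CV. */
static const cv_model name_table[] = {
    { "add", 0 }, { "addition", 0 }, { "subtract", 1 },
    { "multiply", 2 }, { "divide", 3 },
};

/* The single shared body: the plain-C analogue of switch (ix) in CODE:. */
int shared_arith(int ix, int x, int y)
{
    switch (ix) {
    case 0:  return x + y;
    case 1:  return x - y;
    case 2:  return x * y;
    case 3:  return y != 0 ? x / y : 0;  /* sketch only: no divide by zero */
    default: return 0;
    }
}

/* Look a name up and dispatch through the shared body. Returns 1 on
 * success; returns 0 and leaves *result untouched for an unknown name. */
int call_by_name(const char *name, int x, int y, int *result)
{
    size_t i;
    for (i = 0; i < sizeof(name_table) / sizeof(name_table[0]); i++) {
        if (strcmp(name_table[i].name, name) == 0) {
            *result = shared_arith(name_table[i].ix, x, y);
            return 1;
        }
    }
    return 0;
}
```

In this model, calling through "add" and "addition" reaches the same branch, while "subtract" reaches a different branch of the same function, which mirrors the relationship between names, CVs and the one shared XSUB.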
-The POSTCALL: block does not make a lot of sense when the C subroutine -call is supplied by user by providing either CODE: or PPCODE: section. +The alias name can be either a simple function name or can include a +package name. The alias value to the right of the C<=> may be either a +literal positive integer or a word (which is expected to be a CPP define). -=head2 The BOOT: Keyword +The rest of the line following the C keyword, plus any further +lines until the next keyword, are assumed to contain zero or more alias +name and value pairs. -The BOOT: keyword is used to add code to the extension's bootstrap -function. The bootstrap function is generated by the B compiler and -normally holds the statements necessary to register any XSUBs with Perl. -With the BOOT: keyword the programmer can tell the compiler to add extra -statements to the bootstrap function. +A warning will be produced if you create more than one alias to the same +index value. If you want multiple aliases with the same value, then a +backwards-compatible way of achieving this is via separate CPP defines to +the same value, e.g. -This keyword may be used any time after the first MODULE keyword and should -appear on a line by itself. The first blank line after the keyword will -terminate the code block. + #define DIVIDE 3 + #define DIVISION 3 - BOOT: - # The following message will be printed when the - # bootstrap function executes. - printf("Hello from the bootstrap!\n"); + ALIAS: + divide = DIVIDE + division = DIVISION -=head2 The VERSIONCHECK: Keyword +Since Perl 5.38.0 or C 3.51, alias values may refer to +other alias names (or to the main function name) by using C<< => >> rather +than the C<=> symbol: -The VERSIONCHECK: keyword corresponds to B's C<-versioncheck> and -C<-noversioncheck> options. This keyword overrides the command line -options. Version checking is enabled by default. 
When version checking is -enabled the XS module will attempt to verify that its version matches the -version of the PM module. + ALIAS: + divide = 3 + division => divide -To enable version checking: +Both alias names and C<< => >> values may be fully-qualified: - VERSIONCHECK: ENABLE + ALIAS: + red = 1 + COLOR::red => red + COLOUR::red => COLOR::red -To disable version checking: +Note that any L is applied to the main +name of the XSUB, but not to any aliases. - VERSIONCHECK: DISABLE +See L for a fully-worked example using +aliases. -Note that if the version of the PM module is an NV (a floating point -number), it will be stringified with a possible loss of precision -(currently chopping to nine decimal places) so that it may not match -the version of the XS module anymore. Quoting the $VERSION declaration -to make it a string is recommended if long version numbers are used. +See L below for an alternative to +C which is more suited for autocall. Note that C should not +be used together with either of C or C. -=head2 The PROTOTYPES: Keyword +=head3 The INTERFACE: Keyword -The PROTOTYPES: keyword corresponds to B's C<-prototypes> and -C<-noprototypes> options. This keyword overrides the command line options. -Prototypes are disabled by default. When prototypes are enabled, XSUBs will -be given Perl prototypes. This keyword may be used multiple times in an XS -module to enable and disable prototypes for different parts of the module. -Note that B will nag you if you don't explicitly enable or disable -prototypes, with: + MODULE = Foo::Bar PACKAGE = Foo::Bar PREFIX = foobar_ - Please specify prototyping behavior for Foo.xs (see perlxs manual) + int + arith(int a, int b) + INTERFACE: foobar_add foobar_subtract + foobar_divide foobar_multiply + +This keyword can appear anywhere within the L of an XSUB. + +This keyword provides similar functionality to C, but is intended +for XSUBs which use autocall. 
It allows a single XSUB to have multiple +names in the Perl namespace which, when invoked, will call the correct +wrapped C library function. + +In the example above there is a single C XSUB function created (called +C), plus four CVs in the Perl namespace called +C etc. Calling C from Perl invokes +C with some indication of which C function to call, +which is then autocalled. C achieves this by storing an index value +in each CV and making it available via the C variable, while +C currently achieves this by storing a C function pointer in +each CV. So the C CV holds a pointer to the +C C function. The action of the XSUB is to extract the +parameter values from the passed arguments and the function pointer from +the CV, then call the underlying C function. + +Note that storing a function pointer in the CV is an implementation detail +which could change in the future. See L for +details of how to customise the setting and retrieving of this value in +the CV. + +The rest of the line following the C keyword, plus any further +lines until the next keyword, are assumed to contain zero or more +interface names, separated by white space (or commas). + +An interface name is always used as-is for the name of the wrapped C +function. If the name contains a package separator, then it will be +used as-is to generate the Perl name; otherwise any prefix is stripped and +the current package name is prepended. The following shows how a few such +interface names would be processed (assuming the current PACKAGE and +PREFIX are C and C): + + Interface name Perl function name C function name + -------------- ------------------ ---------------- + abc Foo::Bar::abc abc + foobar_abc Foo::Bar::abc foobar_abc + X::Y::foobar_def X::Y::foobar_def X::Y::foobar_def + +Unlike C, the XSUB name is used only as the name of the generated C +function; in the example above, it doesn't cause a Perl function called +C to be created. + +See L for a complete example using +C with the C typemap. 
But note that before Perl +5.44.0 (F 3.60), C would not work properly +on XSUBs used with Perlish return types (as used by C), such as + + Foo::Bar + foo(...) + .... + +This has mostly been fixed in 5.44.0 onwards, but may generate invalid C +code (in particular, invalid function pointer casts) for XSUBs having a +C keyword, unless the value of C is a simple list of +parameter names. + +Note that C should not be used together with either of C +or C. + +=head4 The INTERFACE_MACRO: Keyword -To enable prototypes: + int + arith(int a, int b) + INTERFACE: add subtract divide multiply + INTERFACE_MACRO: MY_FUNC_GET + MY_FUNC_SET - PROTOTYPES: ENABLE +Note that this keyword is deprecated since it assumes a particular +implementation for the C keyword, which might change in future. -To disable prototypes: +This keyword can appear anywhere within the L +or L parts of an XSUB. - PROTOTYPES: DISABLE +By default, the C code generated by the C keyword plants calls +to two macros, C and C, which are +used respectively to set (at boot time) a field in the CV to the address +of the C function pointer to use, and to retrieve (at run time) that value +from the CV. -=head2 The PROTOTYPE: Keyword +The C macro allows you to override the names of the two +macros to be used for this purpose. The rest of the line following the +C keyword, plus any further lines until the next keyword, +should contain (in total) two words which are taken to be macro names. -This keyword is similar to the PROTOTYPES: keyword above but can be used to -force B to use a specific prototype for the XSUB. This keyword -overrides all other prototype options and keywords but affects only the -current XSUB. Consult L for information about Perl -prototypes. +The get macro takes three parameters: the return type of the function, the +CV which holds the function's pointer value, and the field within the CV +which has the pointer value. It should return a C function pointer. 
The +setter macro has two parameters: the CV, and the function pointer. - bool_t - rpcb_gettime(timep, ...) - time_t timep = NO_INIT - PROTOTYPE: $;$ - PREINIT: - char *host = "localhost"; - CODE: - if( items > 1 ) - host = (char *)SvPVbyte_nolen(ST(1)); - RETVAL = rpcb_gettime( host, &timep ); - OUTPUT: - timep - RETVAL +Suppose that in the example above, pointers to the C, +C, C and C functions are kept in a global C +array called C with offsets specified by the enum values +C, C, C and C. Then one +could use: -If the prototypes are enabled, you can disable it locally for a given -XSUB as in the following example: + #define MY_FUNC_GET(ret, cv, f) \ + ((XSINTERFACE_CVT_ANON(ret))arith_ptrs[CvXSUBANY(cv).any_i32]) + #define MY_FUNC_SET(cv, f) \ + CvXSUBANY(cv).any_i32 = CAT2(f, _off) - void - rpcb_gettime_noproto() - PROTOTYPE: DISABLE - ... +to store an array index in the CV, rather than storing the actual function +pointer. -=head2 The ALIAS: Keyword - -The ALIAS: keyword allows an XSUB to have two or more unique Perl names -and to know which of those names was used when it was invoked. The Perl -names may be fully-qualified with package names. Each alias is given an -index. The compiler will setup a variable called C which contain the -index of the alias which was used. When the XSUB is called with its -declared name C will be 0. - -The following example will create aliases C and -C for this function. - - bool_t - rpcb_gettime(host,timep) - char *host - time_t &timep - ALIAS: - FOO::gettime = 1 - BAR::getit = 2 - INIT: - printf("# ix = %d\n", ix ); - OUTPUT: - timep - -A warning will be produced when you create more than one alias to the same -value. This may be worked around in a backwards compatible way by creating -multiple defines which resolve to the same value, or with a modern version -of ExtUtils::ParseXS you can use a symbolic alias, which are denoted with -a C<< => >> instead of a C<< = >>. 
For instance you could change the above -so that the alias section looked like this: - - ALIAS: - FOO::gettime = 1 - BAR::getit = 2 - BAZ::gettime => FOO::gettime - -this would have the same effect as this: - - ALIAS: - FOO::gettime = 1 - BAR::getit = 2 - BAZ::gettime = 1 - -except that the latter will produce warnings during the build process. A -mechanism that would work in a backwards compatible way with older -versions of our tool chain would be to do this: - - #define FOO_GETTIME 1 - #define BAR_GETIT 2 - #define BAZ_GETTIME 1 - - bool_t - rpcb_gettime(host,timep) - char *host - time_t &timep - ALIAS: - FOO::gettime = FOO_GETTIME - BAR::getit = BAR_GETIT - BAZ::gettime = BAZ_GETTIME - INIT: - printf("# ix = %d\n", ix ); - OUTPUT: - timep +=head3 The CASE: Keyword -=head2 The OVERLOAD: Keyword + int + foo(int a, int b = NO_INIT, int c = NO_INIT) + CASE: items == 1 + C_ARGS: 0, a + CASE: items == 2 + C_ARGS: b, a + CASE: + CODE: + RETVAL = b > c ? foo(b, a) : bar(b, a); + OUTPUT: + RETVAL + + +The C keyword allows an XSUB to effectively have multiple bodies, +but with only a single Perl name (unlike C, which has multiple +names). Which body is run depends on which CASE expression is the first to +evaluate to true. Unlike C's C keyword, execution doesn't fall +through to the next branch, so there is no XS equivalent of the C +keyword. The expression for the last CASE is optional, and if not present, +acts as a default branch. + +The example above translates to approximately this C code: + + if (items < 1 || items > 3) { croak("..."); } + + if (items == 1) { + int RETVAL; + int a = (int)SvIV(ST(0)); int b = /* etc */ + RETVAL = foo(0, a); + /* ... return RETVAL as ST(0) ... */ + } + else if (items == 2) { + int RETVAL; + int a = (int)SvIV(ST(0)); int b = /* etc */ + RETVAL = foo(b, a); + /* ... return RETVAL as ST(0) ... */ + } + else { + int RETVAL; + int a = (int)SvIV(ST(0)); int b = /* etc */ + RETVAL = b > c ? foo(b, a) : bar(b, a); + /* ... 
return RETVAL as ST(0) ... */ + } -Instead of writing an overloaded interface using pure Perl, you -can also use the OVERLOAD keyword to define additional Perl names -for your functions (like the ALIAS: keyword above). However, the -overloaded functions must be defined in such a way as to accept the number -of parameters supplied by perl's overload system. For most overload -methods, it will be three parameters; for the C function it will -be four. However, the bitwise operators C<&>, C<|>, C<^>, and C<~> may be -called with three I five arguments (see L). + XSRETURN(1); + +Each C keyword precedes an entire normal XSUB body, including all +keywords from C to C. Generic XSUB keywords can be +placed within any C body. The code generated for each C/C +branch includes nearly all the code that would usually be generated for a +complete XSUB body, including argument processing and return value +stack processing. + +Note that the CASE expressions are outside of the scope of any parameter +variable declarations, so those values can't be used. Typical values which +I in scope and might be used are the C variable which +indicates how many arguments were passed (see L<"Ellipsis: variable-length +parameter lists">) and, in the presence of C, the C variable. + +Here's another example, this time in conjunction with C to wrap the +same C function as two separate Perl functions, the second of which +(perhaps for backwards compatibility reasons) takes its arguments in the +reverse order. This is a somewhat contrived example, but +demonstrates how the C keyword must be within one of the C +branches (it doesn't matter which), as C must always appear in the +outermost scope of the XSUB's body: -If any -function has the OVERLOAD: keyword, several additional lines -will be defined in the c file generated by xsubpp in order to -register with the overload magic. 
+ int + foo(int a, int b) + CASE: ix == 0 + CASE: ix == 1 + ALIAS: foo_rev = 1 + C_ARGS: b, a -Since blessed objects are actually stored as RV's, it is useful -to use the typemap features to preprocess parameters and extract -the actual SV stored within the blessed RV. See the sample for -T_PTROBJ_SPECIAL in L. +Note that using old-style parameter declarations in conjunction with +C allows the types of the parameters to vary between branches: -To use the OVERLOAD: keyword, create an XS function which takes -three input parameters (or use the C-style '...' definition) like -this: + int + foo(a, int b = 0) + CASE: items == 1 + INPUT: + short a + CASE: items == 2 + INPUT: + long a + +In practice, C produces bloated code with all the argument and +return value processing duplicated within each branch, is not often all +that useful, and can often be better written just by using a C +statement within a C block. + +=head2 Using Typemaps + +This section describes the basic facts about using typemaps. For full +information on creating your own typemaps plus a comprehensive list of +what standard typemaps are available, see the L document. + +Typemaps are sets of rules which map C types such as C to logical XS +types such as C, and from there to C and C templates +such as C<$var = ($type)SvIV($arg)> and C which, +after variable expansion, generate C code to convert back and forth +between Perl arguments and C auto variables. + +There is a standard system typemap file bundled with Perl for common C and +Perl types, but in addition, you can add your own typemap file. From Perl +5.16.0 onwards you can also include extra typemap declarations in-line +within the XS file. + +=head3 Locations and ordering of typemap processing + +Typemap definitions are processed in order, with more recent entries +overriding any earlier ones. Definitions are read in first from files and +then from L sections in the XS file. 
+ +When considering how files are located and read in, note that the XS +parser will initially change directory to the directory containing the +F file that is about to be processed, which will affect any +subsequent relative paths. Then any typemap files are located and read in. +The files come from two sources: standard and explicit. + +Standard typemap files are always called C and are searched for +in a standard set of locations (relative to C<@INC> and to the current +directory), and any matched files are read in. These paths are, in order +of processing: + + "$_/ExtUtils/typemap" for reverse @INC + + ../../../../lib/ExtUtils/typemap + ../../../../typemap + ../../../lib/ExtUtils/typemap + ../../../typemap + ../../lib/ExtUtils/typemap + ../../typemap + ../lib/ExtUtils/typemap + ../typemap + typemap + +Note that searching C<@INC> in reverse order means that typemap files +found earlier in C<@INC> are processed later, and thus have higher +priority. + +Explicit typemap files are specified either via C +command line switches, or programmatically by an array passed as: + + ExtUtils::ParseXS::process_file(..., typemap => ['foo',...]); + +These files are read in order, and the parser dies if any explicitly +listed file is not found. + +Prior to Perl 5.10.0 and Perl 5.8.9, C<@INC> wasn't searched, and standard +files were searched for and processed I any explicit ones. From +Perl 5.10.0 onwards, standard files were processed I any explicit +ones. From Perl 5.44.0 (F 3.60) onwards, explicit files +are again processed last, and thus take priority over standard files. +In Perl 5.16.0 onwards, C sections are then processed in order +after all files have been processed. + +Note also that F usually invokes F with two +C<-typemap> arguments: the first being the system typemap and the second +being the module's typemap file, if any. This compensates for older Perls +not searching C<@INC>. 
+ +For a typical distribution, all this complication usually results in the +typemap file bundled with Perl being read in first, then the typemap file +included with the distribution adding to (and overriding) any standard +definitions, then any C entries in the XS file overriding +everything. + +=head3 Reusing, redefining and adding typemap entries + +Both typemap files and C blocks can have up to three sections: +C (which is implicit at the start of the file or block) and +C and C. There is no requirement for all three sections to +be present. Whatever I present is added to the global state for that +section, either adding a new entry or redefining an existing entry. + +Probably the simplest use of an additional typemap entry is to map a new C +type to an I XS type; for example, given this C type: + + typedef enum { red, green, blue } colors; + +then adding the following C-to-XS type-mapping entry to the typemap would +be sufficient if you just want to treat such enums as simple integers when +used as parameter and return types: - SV * - cmp (lobj, robj, swap) - My_Module_obj lobj - My_Module_obj robj - IV swap - OVERLOAD: cmp <=> - { /* function defined here */} + colors T_IV + +Or you could override just an existing INPUT or OUTPUT template; for +example: -In this case, the function will overload both of the three way -comparison operators. For all overload operations using non-alpha -characters, you must type the parameter without quoting, separating -multiple overloads with whitespace. Note that "" (the stringify -overload) should be entered as \"\" (i.e. escaped). + OUTPUT + T_IV + my_sv_setiv($arg, (IV)$var); -Since, as mentioned above, bitwise operators may take extra arguments, you -may want to use something like C<(lobj, robj, swap, ...)> (with -literal C<...>) as your parameter list. 
+For a completely novel type you might want to add an entry to all three +sections: -=head2 The FALLBACK: Keyword + foo T_FOO -In addition to the OVERLOAD keyword, if you need to control how -Perl autogenerates missing overloaded operators, you can set the -FALLBACK keyword in the module header section, like this: + INPUT + T_FOO + $var = ($type)get_foo_from_sv($arg); - MODULE = RPC PACKAGE = RPC + OUTPUT + T_FOO + set_sv_to_foo($arg, $var); + +=head3 Common typemaps + +This section gives an overview of what common typemap entries are +available for use. See the L document for a complete list, +or examine the F file which is bundled with the Perl +distribution. Also, see L for a detailed +dive into one particular typemap which is particularly useful for mapping +between Perl objects and C handles. See L +for a general discussion about returning one or more values from an XSUB, +where typemaps can sometimes be of use (and sometimes aren't). + +Standard signed C int types such as C, C and C, are all +mapped to the C XS type. Integer-like Perl types such as C and +C are also mapped to this. If a parameter is declared as something +mapping to C, then the C value of the passed SV will be +extracted (perhaps first converting a string value like C<"123"> to an +IV), then that value will be cast to the final C type, with the usual C +rules for casting between integer types. Conversely, when returning a +value, the C value is first cast to C, then the SV is set to that +IV value. + +Similarly, common C and Perl unsigned types map to C, and values +are converted back and forth via C<(UV)> casts. A few unsigned types such +as C and C are instead mapped to C and C XS +types, but these have the same effect as C. + +The C type is treated similarly to other C types, but +C is treated as a string rather than an integer.
A C parameter +will treat its passed argument as a string and set the auto variable to +the first I of that string (which may produce weird results with +UTF-8 strings). Returning a C value will return a one-character +string to the Perl caller. + +The C type and its common variants are mapped to C. Passed +parameters will (via C or similar) return a string buffer +representing that SV. This buffer may be part of the SV if that SV has a +string value (or if it can be converted to a string value), or it may be a +temporary buffer otherwise. For example, an SV holding a reference to an +array might return a temporary string buffer with the value +C<"ARRAY(0x12345678)">. When an XSUB has a return type which maps to +C, the temporary SV which is to be returned gets assigned the +current value of C, with the string's length being determined by +C or its equivalent. + +See L for the difficulties associated with handling +UTF-8 strings. + +The C, C and C types map to C, C and +C XS types, which all operate by converting to and from an SV via +C and C with suitable casting. + +The C type maps to C, which basically does no processing and +allows you to access the actual passed SV argument. + +=head3 T_PTROBJ and opaque handles + +A common interface arrangement for C libraries is that some sort of +I function creates and returns a handle, which is a pointer to +some opaque data. Other function calls are then passed that handle as an +argument, until finally some sort of destroy function frees the handle and +its data. The C typemap is one common method for mapping Perl +objects to such C library handles. Behind the scenes, it uses blessed +scalar objects with the scalar's integer value set to the address of the +handle. The C code template of the C typemap retrieves the +pointer from the scalar object referred to by a passed RV argument, while +the C template creates a new blessed RV-to-SV with the handle +address stored in it. 
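The core idea described above, keeping a handle's address in a scalar's integer slot and casting it back later, can be illustrated in plain C. This is only a sketch of the round trip under stated assumptions: handle_t, scalar_model, store_handle and fetch_handle are hypothetical names invented for this illustration, and intptr_t is assumed to play the role of the scalar's integer value. The real typemap templates operate on SVs, and the blessing of the scalar described above is not modelled here.

```c
#include <stdint.h>

/* Hypothetical opaque handle, as a C library might define it. */
typedef struct { int payload; } handle_t;

/* Hypothetical stand-in for a scalar with an integer slot; intptr_t is
 * assumed here to be wide enough to hold a pointer value. */
typedef struct { intptr_t iv; } scalar_model;

/* "OUTPUT" direction: record the handle's address as an integer. */
void store_handle(scalar_model *sv, handle_t *h)
{
    sv->iv = (intptr_t)h;
}

/* "INPUT" direction: recover the original pointer from the integer. */
handle_t *fetch_handle(const scalar_model *sv)
{
    return (handle_t *)sv->iv;
}
```

Nothing is copied in either direction: the scalar only records where the handle lives, so the handle must stay allocated for as long as the scalar may be used to reach it.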
+ +For the purposes of an example, we'll create here a minimal example C +library called C, which we'll then proceed to wrap using XS. This +library just stores an integer in its opaque data. In real life you would +be wrapping an existing library which stores something more interesting, +such as a complex number or a multiple precision integer. + +The following sample library code might go in the initial 'C' part of the +XS file: + + typedef struct { int i; } mynum; + + mynum* mynum_new(int i) + { + mynum* x = (mynum*)malloc(sizeof(mynum)); + x->i = i; + return x; + } - FALLBACK: TRUE - ... + void mynum_destroy (mynum *x) + { free((void*)x); } -where FALLBACK can take any of the three values TRUE, FALSE, or -UNDEF. If you do not set any FALLBACK value when using OVERLOAD, -it defaults to UNDEF. FALLBACK is not used except when one or -more functions using OVERLOAD have been defined. Please see -L for more details. + int mynum_val (mynum *x) + { return x->i; } -=head2 The INTERFACE: Keyword + mynum* mynum_add (mynum *x, mynum *y) + { return mynum_new(x->i + y->i); } -This keyword declares the current XSUB as a keeper of the given -calling signature. If some text follows this keyword, it is -considered as a list of functions which have this signature, and -should be attached to the current XSUB. + mynum* mynum_subtract (mynum *x, mynum *y) + { return mynum_new(x->i - y->i); } -For example, if you have 4 C functions multiply(), divide(), add(), -subtract() all having the signature: + mynum* mynum_multiply (mynum *x, mynum *y) + { return mynum_new(x->i * y->i); } - symbolic f(symbolic, symbolic); + mynum* mynum_divide (mynum *x, mynum *y) + { return mynum_new(x->i / y->i); } -you can make them all to use the same XSUB using this: +The C struct holds the opaque handle data. The C +function creates a numeric value and returns a handle to it. The other +functions then take such handles as arguments, including a destroy +function to free a handle's data. 
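Before wrapping the library, it may help to see how its handles would be driven from plain C. The sketch below repeats four of the library functions shown above and adds one hypothetical helper, mynum_sum_of, which is not part of the sample library and exists only to illustrate the create, use and destroy life cycle that a wrapper has to mirror.

```c
#include <stdlib.h>

typedef struct { int i; } mynum;

mynum *mynum_new(int i)
{
    mynum *x = (mynum *)malloc(sizeof(mynum));
    x->i = i;   /* as in the sample library, malloc failure is not checked */
    return x;
}

void mynum_destroy(mynum *x) { free((void *)x); }

int mynum_val(mynum *x) { return x->i; }

mynum *mynum_add(mynum *x, mynum *y) { return mynum_new(x->i + y->i); }

/* Hypothetical helper, not part of the library: add two ints by way of
 * handles, freeing every handle that was created along the way. */
int mynum_sum_of(int a, int b)
{
    mynum *x = mynum_new(a);
    mynum *y = mynum_new(b);
    mynum *sum = mynum_add(x, y);   /* returns a third, newly allocated handle */
    int result = mynum_val(sum);

    mynum_destroy(sum);
    mynum_destroy(y);
    mynum_destroy(x);
    return result;
}
```

Because mynum_add returns a fresh handle rather than modifying either argument, three handles exist by the end of the helper and each needs its own call to mynum_destroy.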
- symbolic - interface_s_ss(arg1, arg2) - symbolic arg1 - symbolic arg2 - INTERFACE: - multiply divide - add subtract +The following XS code shows an example of how this library might be +wrapped and be made accessible from Perl via C objects: -(This is the complete XSUB code for 4 Perl functions!) Four generated -Perl function share names with corresponding C functions. + typedef mynum *My__Num; -The advantage of this approach comparing to ALIAS: keyword is that there -is no need to code a switch statement, each Perl function (which shares -the same XSUB) knows which C function it should call. Additionally, one -can attach an extra function remainder() at runtime by using + MODULE = My::Num PACKAGE = My::Num PREFIX = mynum_ - CV *mycv = newXSproto("Symbolic::remainder", - XS_Symbolic_interface_s_ss, __FILE__, "$$"); - XSINTERFACE_FUNC_SET(mycv, remainder); + PROTOTYPES: DISABLE -say, from another XSUB. (This example supposes that there was no -INTERFACE_MACRO: section, otherwise one needs to use something else instead of -C, see the next section.) + TYPEMAP: <, -and C for this C. The setter macro is given cv, -and the function pointer. + void + DESTROY(My::Num x) + CODE: + mynum_destroy(x); -The default value is C and C. -An INTERFACE keyword with an empty list of functions can be omitted if -INTERFACE_MACRO keyword is used. + int + mynum_val(My::Num x) -Suppose that in the previous example functions pointers for -multiply(), divide(), add(), subtract() are kept in a global C array -C with offsets being C, C, C, -C. 
Then one can use + My::Num + mynum_add(My::Num x, My::Num y) + ALIAS: subtract = 1 + multiply = 2 + divide = 3 + CODE: + switch (ix) { + case 0: RETVAL = mynum_add(x, y); break; + case 1: RETVAL = mynum_subtract(x, y); break; + case 2: RETVAL = mynum_multiply(x, y); break; + case 3: RETVAL = mynum_divide(x, y); break; + } + OUTPUT: + RETVAL - #define XSINTERFACE_FUNC_BYOFFSET(ret,cv,f) \ - ((XSINTERFACE_CVT_ANON(ret))fp[CvXSUBANY(cv).any_i32]) - #define XSINTERFACE_FUNC_BYOFFSET_set(cv,f) \ - CvXSUBANY(cv).any_i32 = CAT2( f, _off ) +The XSUBs in this example are mostly declared with parameter and return +types of C which, as explained in L, is looked up as-is in the typemap, but has C +applied to the type name to convert it to the C C type when used +in the declaration of the XSUB's auto variables. -in C section, +Going through this code in order: while still in the 'C' half of the XS +file, we add a typedef which says that the C C type is equivalent +to a pointer to a handle from that arithmetic library. - symbolic - interface_s_ss(arg1, arg2) - symbolic arg1 - symbolic arg2 - INTERFACE_MACRO: - XSINTERFACE_FUNC_BYOFFSET - XSINTERFACE_FUNC_BYOFFSET_set - INTERFACE: - multiply divide - add subtract +Next, the C line includes a C prefix, which means that +the names of the XSUBs in the Perl namespace will be C etc +rather than C. -in XSUB section. +Then a C declaration is used to map the C pseudo-type to +the C XS type. -=head2 The INCLUDE: Keyword +Next comes the C class method. This will be called from perl as +C<< My::Num->new(99); >> for example. Its first parameter will be the +class name, which we don't use here, and the second parameter is the value +to initialise the object to. The XSUB autocalls the library C +function with just the C value. This returns a handle, which the +C C map converts into a blessed scalar ref containing +the handle. -This keyword can be used to pull other files into the XS module. The other -files may have XS code. 
INCLUDE: can also be used to run a command to -generate the XS code to be pulled into the module. +Next, the C method is just a thin wrapper around +C, while C returns the integer value of the +object. -The file F contains our C function: +Finally, four binary functions are defined, sharing the same XSUB body via +aliases. As an alternative, the code for the main XSUB could be simplified +using the L keyword rather than +using aliasing: - bool_t - rpcb_gettime(host,timep) - char *host - time_t &timep - OUTPUT: - timep + My::Num + arithmetic_interface(My::Num x, My::Num y) + INTERFACE: + mynum_add + mynum_subtract + mynum_multiply + mynum_divide -The XS module can use INCLUDE: to pull that file into it. +but note that C only supports Perlish return types such +as C from Perl 5.44.0 (F 3.60) onwards. - INCLUDE: Rpcb1.xsh +This XS module might be accessed from Perl using code like this: -If the parameters to the INCLUDE: keyword are followed by a pipe (C<|>) then -the compiler will interpret the parameters as a command. This feature is -mildly deprecated in favour of the C directive, as documented -below. + use My::Num; - INCLUDE: cat Rpcb1.xsh | - -Do not use this to run perl: C will run the perl that -happens to be the first in your path and not necessarily the same perl that is -used to run C. See L<"The INCLUDE_COMMAND: Keyword">. - -=head2 The INCLUDE_COMMAND: Keyword - -Runs the supplied command and includes its output into the current XS -document. C assigns special meaning to the C<$^X> token -in that it runs the same perl interpreter that is running C: - - INCLUDE_COMMAND: cat Rpcb1.xsh - - INCLUDE_COMMAND: $^X -e ... - -=head2 The CASE: Keyword - -The CASE: keyword allows an XSUB to have multiple distinct parts with each -part acting as a virtual XSUB. CASE: is greedy and if it is used then all -other XS keywords must be contained within a CASE:.
This means nothing may -precede the first CASE: in the XSUB and anything following the last CASE: is -included in that case. - -A CASE: might switch via a parameter of the XSUB, via the C ALIAS: -variable (see L<"The ALIAS: Keyword">), or maybe via the C variable -(see L<"Variable-length Parameter Lists">). The last CASE: becomes the -B case if it is not associated with a conditional. The following -example shows CASE switched via C with a function C -having an alias C. When the function is called as -C its parameters are the usual C<(char *host, time_t *timep)>, -but when the function is called as C its parameters are -reversed, C<(time_t *timep, char *host)>. - - long - rpcb_gettime(a,b) - CASE: ix == 1 - ALIAS: - x_gettime = 1 - INPUT: - # 'a' is timep, 'b' is host - char *b - time_t a = NO_INIT - CODE: - RETVAL = rpcb_gettime( b, &a ); - OUTPUT: - a - RETVAL - CASE: - # 'a' is host, 'b' is timep - char *a - time_t &b = NO_INIT - OUTPUT: - b - RETVAL + my $i2 = My::Num->new(2); + my $i7 = My::Num->new(7); + my $i13 = My::Num->new(13); -That function can be called with either of the following statements. Note -the different argument lists. + my $x = $i13->add($i7)->divide($i2); + printf "val=%d\n", $x->val(); # prints "val=10" - $status = rpcb_gettime( $host, $timep ); +See L for an example of how to extend this using +overloading so that the expression could be written more simply as +C<($i13 + $i7)/$i2>. - $status = x_gettime( $timep, $host ); +Note that, as a very special case, the XS compiler translates the XS +typemap name using C when looking up INPUT typemap entries +for an XSUB named C. So for such subs, the C typemap +entry will be used instead. -=head2 The EXPORT_XSUB_SYMBOLS: Keyword +=head2 Using XS With C++ -The EXPORT_XSUB_SYMBOLS: keyword is likely something you will never need. -In perl versions earlier than 5.16.0, this keyword does nothing. Starting -with 5.16, XSUB symbols are no longer exported by default. That is, they -are C functions. 
If you include + MODULE = Foo::Bar PACKAGE = Foo::Bar - EXPORT_XSUB_SYMBOLS: ENABLE + # Class methods -in your XS code, the XSUBs following this line will not be declared C. -You can later disable this with + int + X::Y::new(int i) - EXPORT_XSUB_SYMBOLS: DISABLE + static int + X::Y::foo(int i) -which, again, is the default that you should probably never change. -You cannot use this keyword on versions of perl before 5.16 to make -XSUBs C. + # Object methods -=head2 The & Unary Operator + int + X::Y::bar(int i) -The C<&> unary operator in the INPUT: section is used to tell B -that it should convert a Perl value to/from C using the C type to the left -of C<&>, but provide a pointer to this value when the C function is called. + int + X::Y::bar2(int i) const -This is useful to avoid a CODE: block for a C function which takes a parameter -by reference. Typically, the parameter should be not a pointer type (an -C or C but not an C or C). + void + X::Y::DESTROY() -The following XSUB will generate incorrect C code. The B compiler will -turn this into code which calls C with parameters C<(char -*host, time_t timep)>, but the real C wants the C -parameter to be of type C rather than C. + # C-linkage function - bool_t - rpcb_gettime(host,timep) - char *host - time_t timep - OUTPUT: - timep + extern "C" int + baz(int i) -That problem is corrected by using the C<&> operator. The B compiler -will now turn this into code which calls C correctly with -parameters C<(char *host, time_t *timep)>. It does this by carrying the -C<&> through, so the function call looks like C. +XS provides limited support for generating C++ (as opposed to C) output +files. Any XSUB whose name includes C<::> is treated as a C++ method. 
This +triggers two main changes in the way the XSUB's code is generated: - bool_t - rpcb_gettime(host,timep) - char *host - time_t &timep - OUTPUT: - timep - -=head2 Inserting POD, Comments and C Preprocessor Directives - -C preprocessor directives are allowed within BOOT:, PREINIT: INIT:, CODE:, -PPCODE:, POSTCALL:, and CLEANUP: blocks, as well as outside the functions. -Comments are allowed anywhere after the MODULE keyword. The compiler will -pass the preprocessor directives through untouched and will remove the -commented lines. POD documentation is allowed at any point, both in the -C and XS language sections. POD must be terminated with a C<=cut> command; -C will exit with an error if it does not. It is very unlikely that -human generated C code will be mistaken for POD, as most indenting styles -result in whitespace in front of any line starting with C<=>. Machine -generated XS files may fall into this trap unless care is taken to -ensure that a space breaks the sequence "\n=". - -Comments can be added to XSUBs by placing a C<#> as the first -non-whitespace of a line. Care should be taken to avoid making the -comment look like a C preprocessor directive, lest it be interpreted as -such. The simplest way to prevent this is to put whitespace in front of -the C<#>. - -If you use preprocessor directives to choose one of two -versions of a function, use - - #if ... version1 - #else /* ... version2 */ - #endif +=over -and not +=item * - #if ... version1 - #endif - #if ... version2 - #endif +An implicit first argument is added. For class methods, this will be +called C and will be of type C. For object methods, it will +be called C and be of type C (where C is the prefix +of the XSUB's name). XSUBs are treated as class methods if their name is +C or their return type has the C prefix. -because otherwise B will believe that you made a duplicate -definition of the function. 
Also, put a blank line before the -#else/#endif so it will not be seen as part of the function body. +=item * -=head2 Using XS With C++ +Any autocall will generate an appropriate C++ method call rather than a C +function call. In particular, based on the examples above: -If an XSUB name contains C<::>, it is considered to be a C++ method. -The generated Perl function will assume that -its first argument is an object pointer. The object pointer -will be stored in a variable called THIS. The object should -have been created by C++ with the new() function and should -be blessed by Perl with the sv_setref_pv() macro. The -blessing of the object by Perl can be handled by a typemap. An example -typemap is shown at the end of this section. + new: RETVAL = new X::Y(i); + static foo: RETVAL = X::Y::foo(i); + bar (and bar2): RETVAL = THIS->bar(i); + DESTROY: delete THIS; -If the return type of the XSUB includes C, the method is considered -to be a static method. It will call the C++ -function using the class::method() syntax. If the method is not static -the function will be called using the THIS->method() syntax. +=item * -The next examples will use the following C++ class. +In addition, if the XSUB declaration has a trailing C, then the +type of C will be declared as C. - class color { - public: - color(); - ~color(); - int blue(); - void set_blue( int ); +=back - private: - int c_blue; - }; +This is mostly just syntactic sugar. The C XSUB declaration above +could be written longhand as: -The XSUBs for the blue() and set_blue() methods are defined with the class -name but the parameter for the object (THIS, or "self") is implicit and is -not listed.
+ int + bar(X::Y* THIS, int i) + CODE: + RETVAL = THIS->bar(i); + OUTPUT: + RETVAL - int - color::blue() +Note that the type of C (and, since Perl 5.42, C) can be +overridden with a line in an C section: - void - color::set_blue( val ) - int val + int + X::Y::bar(int i) + X::Y::Z *THIS -Both Perl functions will expect an object as the first parameter. In the -generated C++ code the object is called C, and the method call will -be performed on this object. So in the C++ code the blue() and set_blue() -methods will be called as this: +Finally, a plain C XSUB declaration can be prefixed with C to +give that XSUB C linkage. - RETVAL = THIS->blue(); +Some of the methods above might be called from Perl using code like this: - THIS->set_blue( val ); + { + my $obj = Foo::Bar->new(1); + $obj->bar(2); + # implicit $obj->DESTROY(); + } -You could also write a single get/set method using an optional argument: +This example uses C rather than C to emphasise that the +name of the C++ class needn't follow the Perl package name. - int - color::blue( val = NO_INIT ) - int val - PROTOTYPE $;$ - CODE: - if (items > 1) - THIS->set_blue( val ); - RETVAL = THIS->blue(); - OUTPUT: - RETVAL +The call to C will pass the string C<"Foo::Bar"> as the first +argument, which can be used to allow multiple Perl classes to share the +same C method. In the simple worked example below, the package name +is hard-coded and that parameter is unused. The C method is +expected to return a Perl object which in some way has a pointer to the +underlying C++ object embedded within it. This is similar to the +L example of wrapping a C library which +uses a handle, although with a subtle difference, as explained below. -If the function's name is B then the C++ C function will be -called and C will be given as its parameter. The generated C++ code for +Calling C passes this Perl object as the first argument, which the +typemap will use to extract the C++ object pointer and assign to the +C auto variable.
- void - color::DESTROY() +=head3 A complete C++ example -will look like this: +First, you need to tell MakeMaker or similar that the generated file +should be compiled using a C++ compiler. For basic experimentation you may +be able to get by with just adding these two lines to the +C method call in F: - color *THIS = ...; // Initialized as in typemap + CC => 'c++', + LD => '$(CC)', - delete THIS; +but for portability in production use, you may want to use something like +L to automatically generate the correct options for +L or L based on which C++ compiler +is available. -If the function's name is B then the C++ C function will be called -to create a dynamic C++ object. The XSUB will expect the class name, which -will be kept in a variable called C, to be given as the first -argument. +Then create a C<.xs> file like this: - color * - color::new() + #define PERL_NO_GET_CONTEXT -The generated C++ code will call C. + #include "EXTERN.h" + #include "perl.h" + #include "XSUB.h" + #include "ppport.h" + + namespace Paint { + class color { + int c_R; + int c_G; + int c_B; + public: + color(int r, int g, int b) { c_R = r; c_G = g; c_B = b; } + ~color() { printf("destructor called\n"); } + int blue() { return c_B; } + void set_blue(int b) { c_B = b; }; + // and similar for red, green + }; + } - RETVAL = new color(); + typedef Paint::color Paint__color; -The following is an example of a typemap that could be used for this C++ -example. + MODULE = Foo::Bar PACKAGE = Foo::Bar - TYPEMAP - color * O_OBJECT + PROTOTYPES: DISABLE - OUTPUT - # The Perl object is blessed into 'CLASS', which should be a - # char* having the name of the package for the blessing. - O_OBJECT - sv_setref_pv( $arg, CLASS, (void*)$var ); + TYPEMAP: <) is often sufficient. -However, sometimes the interface will look -very C-like and occasionally nonintuitive, especially when the C function -modifies one of its parameters, or returns failure inband (as in "negative -return values mean failure"). 
In cases where the programmer wishes to -create a more Perl-like interface the following strategy may help to -identify the more critical parts of the interface. - -Identify the C functions with input/output or output parameters. The XSUBs for -these functions may be able to return lists to Perl. - -Identify the C functions which use some inband info as an indication -of failure. They may be -candidates to return undef or an empty list in case of failure. If the -failure may be detected without a call to the C function, you may want to use -an INIT: section to report the failure. For failures detectable after the C -function returns one may want to use a POSTCALL: section to process the -failure. In more complicated cases use CODE: or PPCODE: sections. - -If many functions use the same failure indication based on the return value, -you may want to create a special typedef to handle this situation. Put - - typedef int negative_is_failure; - -near the beginning of XS file, and create an OUTPUT typemap entry -for C which converts negative values to C, or -maybe croak()s. After this the return value of type C -will create more Perl-like interface. - -Identify which values are used by only the C and XSUB functions -themselves, say, when a parameter to a function should be a contents of a -global variable. If Perl does not need to access the contents of the value -then it may not be necessary to provide a translation for that value -from C to Perl. - -Identify the pointers in the C function parameter lists and return -values. Some pointers may be used to implement input/output or -output parameters, they can be handled in XS with the C<&> unary operator, -and, possibly, using the NO_INIT keyword. -Some others will require handling of types like C, and one needs -to decide what a useful Perl translation will do in such a case. When -the semantic is clear, it is advisable to put the translation into a typemap -file. - -Identify the structures used by the C functions. 
In many -cases it may be helpful to use the T_PTROBJ typemap for -these structures so they can be manipulated by Perl as -blessed objects. (This is handled automatically by C.) - -If the same C type is used in several different contexts which require -different translations, C several new types mapped to this C type, -and create separate F entries for these new types. Use these -types in declarations of return type and parameters to XSUBs. - -=head2 Perl Objects And C Structures - -When dealing with C structures one should select either -B or B for the XS type. Both types are -designed to handle pointers to complex objects. The -T_PTRREF type will allow the Perl object to be unblessed -while the T_PTROBJ type requires that the object be blessed. -By using T_PTROBJ one can achieve a form of type-checking -because the XSUB will attempt to verify that the Perl object -is of the expected type. - -The following XS code shows the getnetconfigent() function which is used -with ONC+ TIRPC. The getnetconfigent() function will return a pointer to a -C structure and has the C prototype shown below. The example will -demonstrate how the C pointer will become a Perl reference. Perl will -consider this reference to be a pointer to a blessed object and will -attempt to call a destructor for the object. A destructor will be -provided in the XS source to free the memory used by getnetconfigent(). -Destructors in XS can be created by specifying an XSUB function whose name -ends with the word B. XS destructors can be used to free memory -which may have been malloc'd by another XSUB. - - struct netconfig *getnetconfigent(const char *netid); - -A C will be created for C. The Perl -object will be blessed in a class matching the name of the C -type, with the tag C appended, and the name should not -have embedded spaces if it will be a Perl package name. 
The -destructor will be placed in a class corresponding to the -class of the object and the PREFIX keyword will be used to -trim the name to the word DESTROY as Perl will expect. - - typedef struct netconfig Netconfig; - - MODULE = RPC PACKAGE = RPC - - Netconfig * - getnetconfigent(netid) - char *netid - - MODULE = RPC PACKAGE = NetconfigPtr PREFIX = rpcb_ - - void - rpcb_DESTROY(netconf) - Netconfig *netconf - CODE: - printf("Now in NetconfigPtr::DESTROY\n"); - free( netconf ); + OUTPUT + T_PKG_OBJ + sv_setref_pv($arg, "$Package", (void*)$var); -This example requires the following typemap entry. Consult -L for more information about adding new typemaps -for an extension. + EOF - TYPEMAP - Netconfig * T_PTROBJ + Paint::color * + Paint::color::new(int r, int g, int b) -This example will be used with the following Perl statements. + int + Paint::color::blue() - use RPC; - $netconf = getnetconfigent("udp"); + void + Paint::color::set_blue(int b) -When Perl destroys the object referenced by $netconf it will send the -object to the supplied XSUB DESTROY function. Perl cannot determine, and -does not care, that this object is a C struct and not a Perl object. In -this sense, there is no difference between the object created by the -getnetconfigent() XSUB and an object created by a normal Perl subroutine. + void + Paint::color::DESTROY() + +In the C part of the XS file (or in this case, the C++ part), a trivial +example C++ class is defined. This would more typically be a pre-existing +library with just the appropriate C<#include>. The example includes a +namespace to make it clearer when something is a namespace, class name or +Perl package. The Perl package is called C rather than +C to again distinguish it. You could however call the Perl +package C if you desired. + +A single typedef follows to allow for XS-mangled class names, as explained +in L. + +Then the C line starts the XS part of the file. + +Then there follows a full definition of a new typemap called C.
+This is actually a direct copy of the C typemap found in the +system typemap file, except that all occurrences of C<$ntype> have been +replaced with C<$Package>. It serves the same basic purpose as +C: embedding a pointer within a new blessed Perl object, +and later, retrieving that pointer from the object. The difference is in +terms of what package the object is blessed into. C expects the +type name (C) to already be a pointer type, but with a C++ +XSUB, the implicit C argument is automatically declared to be of +type C (so C itself isn't necessarily a +pointer type). In addition, when the Perl and C++ class names differ we +want the object to be blessed using the Perl package name, not the C++ +class name. In this example, the actual values of the two variables when +the typemap template is being evalled, are: + + $ntype = "Paint::colorPtr"; + $Package = "Foo::Bar"; + +The typemap also includes an INPUT definition for C, which is +an I copy of C. This is needed because, as an +optimisation, the XS parser automatically renames an INPUT typemap using +C if the name of the XSUB is C, on the grounds that +it's not necessary to check that the object is the right class. + +Finally the XS file includes a few XSUBs which are wrappers around the +class's methods. + +This class might be used like this: + + use Foo::Bar; + + my $color = Foo::Bar->new(0x10, 0x20, 0xff); + printf "blue=%d\n", $color->blue(); # prints 255 + $color->set_blue(0x80); + printf "blue=%d\n", $color->blue(); # prints 128 =head2 Safely Storing Static Data in XS -Starting with Perl 5.8, a macro framework has been defined to allow -static data to be safely stored in XS modules that will be accessed from -a multi-threaded Perl. +You should generally avoid declaring static variables and data within an +XS file. The Perl interpreter binary is commonly configured to allow +multiple interpreter structures, with a complete set of interpreter state +per interpreter struct.
In this case, you usually need your "static" data +to be per-interpreter rather than a single shared per-process value. + +This becomes more important in the presence of multiple threads; either +via C or where the Perl interpreter is embedded within +another application (such as a web server) which may manage its own +threads and allocate interpreters to threads as it sees fit. + +A macro framework is available to XS code to allow a single C struct to be +declared and safely accessed. Behind the scenes, the struct will be +allocated per interpreter or thread; on non-threaded Perl interpreter +builds, the macros gracefully degrade to a single global instance. These +macros have C ("my context") as part of their names. -Although primarily designed for use with multi-threaded Perl, the macros -have been designed so that they will work with non-threaded Perl as well. +It is therefore strongly recommended that these macros be used by all XS +modules that make use of static data. -It is therefore strongly recommended that these macros be used by all -XS modules that make use of static data. +When creating a new skeleton F file, you can use the C<--global> +option of F to also include a skeleton set of macros, e.g. -The easiest way to get a template set of macros to use is by specifying -the C<-g> (C<--global>) option with h2xs (see L). + h2xs -A --global -n Foo::Bar -Below is an example module that makes use of the macros. +Below is a complete example module that makes use of the macros. It tracks +the names of up to three blind mice. #define PERL_NO_GET_CONTEXT #include "EXTERN.h" #include "perl.h" #include "XSUB.h" + #include "ppport.h" + + #define MAX_NAME_LEN 100 /* Global Data */ @@ -1973,93 +4578,129 @@ Below is an example module that makes use of the macros. 
typedef struct { int count; - char name[3][100]; + char name[3][MAX_NAME_LEN+1]; } my_cxt_t; START_MY_CXT - MODULE = BlindMice PACKAGE = BlindMice + MODULE = BlindMice PACKAGE = BlindMice + + PROTOTYPES: DISABLE BOOT: { MY_CXT_INIT; MY_CXT.count = 0; - strcpy(MY_CXT.name[0], "None"); - strcpy(MY_CXT.name[1], "None"); - strcpy(MY_CXT.name[2], "None"); } int - newMouse(char * name) - PREINIT: - dMY_CXT; - CODE: - if (MY_CXT.count >= 3) { - warn("Already have 3 blind mice"); - RETVAL = 0; - } - else { - RETVAL = ++ MY_CXT.count; - strcpy(MY_CXT.name[MY_CXT.count - 1], name); - } - OUTPUT: - RETVAL + AddMouse(char *name) + PREINIT: + dMY_CXT; + CODE: + if (strlen(name) > MAX_NAME_LEN) + croak("Mouse name too long\n"); + + if (MY_CXT.count >= 3) { + warn("Already have 3 blind mice"); + RETVAL = 0; + } + else { + RETVAL = ++MY_CXT.count; + strcpy(MY_CXT.name[MY_CXT.count - 1], name); + } + OUTPUT: + RETVAL char * - get_mouse_name(index) - int index - PREINIT: - dMY_CXT; - CODE: - if (index > MY_CXT.count) - croak("There are only 3 blind mice."); - else + get_mouse_name(int index) + PREINIT: + dMY_CXT; + CODE: + if (index < 1 || index > MY_CXT.count) + croak("There are only %d blind mice.", MY_CXT.count); + else RETVAL = MY_CXT.name[index - 1]; - OUTPUT: - RETVAL + OUTPUT: + RETVAL void CLONE(...) - CODE: - MY_CXT_CLONE; + CODE: + MY_CXT_CLONE; + +The main points from this example are: + +=over + +=item * + +The C struct will hold all your "static" data. + +=item * + +The C and C are boilerplate to make the macro +system work. The former is a string which should be unique to your module. + +=item * -=head3 MY_CXT REFERENCE +The C in the C section allocates the struct when the +module is loaded. You can add further boot code which does any +initialisation you require (such as setting C). C is called +at most once per interpreter, when code in that interpreter instance first +does C.
+ +=item * + +Each XSUB includes a C declaration, which retrieves a pointer to +the struct associated with the current interpreter and saves it in a +hidden auto variable; C allows you to access fields within this +structure. + +=item * + +C creates a byte-for-byte copy of the current struct. This +is called from the special C XSUB, to ensure that each new thread +gets its own copy of the data which is otherwise shared by default. + +=back + +=head3 MY_CXT macros reference =over 5 =item MY_CXT_KEY -This macro is used to define a unique key to refer to the static data -for an XS module. The suggested naming scheme, as used by h2xs, is to -use a string that consists of the module name, the string "::_guts" -and the module version number. +This macro is used to define a unique key to refer to the static data for +an XS module. The suggested naming scheme, as used by F, is to use a +string that consists of a concatenation of the module name, the string +C<::_guts> and the module version number: #define MY_CXT_KEY "MyModule::_guts" XS_VERSION -=item typedef my_cxt_t - -This struct typedef I always be called C. The other -C macros assume the existence of the C typedef name. +=item my_cxt_t -Declare a typedef named C that is a structure that contains -all the data that needs to be interpreter-local. +The "static" values should be stored within a struct typedef which I +always be called C. The other C<*MY_CXT*> macros assume the +existence of the C typedef name. For example: typedef struct { int some_value; + int some_other_value; } my_cxt_t; =item START_MY_CXT -Always place the START_MY_CXT macro directly after the declaration -of C. +This macro contains hidden boilerplate code. Always place the +C macro directly after the declaration of C. =for apidoc Amnh||START_MY_CXT =item MY_CXT_INIT -The MY_CXT_INIT macro initializes storage for the C struct. +The C macro initializes storage for the C struct. -It I be called exactly once, typically in a BOOT: section. 
If you +It must be called I, typically in a BOOT: section. If you are maintaining multiple interpreters, it should be called once in each interpreter instance, except for interpreters cloned from existing ones. (But see L below.) @@ -2068,21 +4709,21 @@ interpreter instance, except for interpreters cloned from existing ones. =item dMY_CXT -Use the dMY_CXT macro (a declaration) in all the functions that access -MY_CXT. +Use the C macro (a declaration) at the start of all the XSUBs +(and other functions) that access C. =for apidoc Amnh||dMY_CXT =item MY_CXT -Use the MY_CXT macro to access members of the C struct. For +Use the C macro to access members of the C struct. For example, if C is typedef struct { int index; } my_cxt_t; -then use this to access the C member +then use this to access the C member: dMY_CXT; MY_CXT.index = 2; @@ -2091,7 +4732,8 @@ then use this to access the C member C may be quite expensive to calculate, and to avoid the overhead of invoking it in each function it is possible to pass the declaration -onto other functions using the C/C macros, eg +onto other functions using the argument/parameter C/C +macros, e.g.: =for apidoc Amnh||_aMY_CXT =for apidoc Amnh||aMY_CXT @@ -2102,26 +4744,33 @@ onto other functions using the C/C macros, eg =for apidoc Amnh||MY_CXT void sub1() { - dMY_CXT; - MY_CXT.index = 1; - sub2(aMY_CXT); + dMY_CXT; + MY_CXT.index = 1; + sub2(aMY_CXT); } void sub2(pMY_CXT) { - MY_CXT.index = 2; + MY_CXT.index = 2; } -Analogously to C, there are equivalent forms for when the macro is the -first or last in multiple arguments, where an underscore represents a -comma, i.e. C<_aMY_CXT>, C, C<_pMY_CXT> and C. +Analogously to C, there are equivalent forms for when the macro is +the first or last in multiple arguments, where an underscore is expanded +to a comma where appropriate, i.e. C<_aMY_CXT>, C, C<_pMY_CXT> +and C. 
These allow for the possibility that those macros might +optimise away any actual argument without leaving a stray comma. =item MY_CXT_CLONE -By default, when a new interpreter is created as a copy of an existing one -(eg via C<< threads->create() >>), both interpreters share the same physical -my_cxt_t structure. Calling C (typically via the package's -C function), causes a byte-for-byte copy of the structure to be -taken, and any future dMY_CXT will cause the copy to be accessed instead. +When a new interpreter is created as a copy of an existing one (e.g. via +C<< threads->create() >>), then by default, both interpreters share the +same physical my_cxt_t structure. Calling C (typically via +the package's C function), causes a byte-for-byte copy of the +structure to be taken (but not a deep copy) and any future C will +cause the copy to be accessed instead. + +This is typically used within the C method which is called each +time an interpreter is copied (usually when creating a new thread). Other +code can be added to C to deep copy items within the structure. =for apidoc Amnh||MY_CXT_CLONE @@ -2129,92 +4778,62 @@ taken, and any future dMY_CXT will cause the copy to be accessed instead. =item dMY_CXT_INTERP(my_perl) -These are versions of the macros which take an explicit interpreter as an -argument. +These are variants of the C and C macros which take +an explicit perl interpreter as an argument. =back Note that these macros will only work together within the I source -file; that is, a dMY_CTX in one source file will access a different structure -than a dMY_CTX in another source file. +file; that is, a C in one source file will access a different +structure than a C in another source file. =head1 EXAMPLES -File C: Interface to some ONC+ RPC bind library functions. 
- - #define PERL_NO_GET_CONTEXT - #include "EXTERN.h" - #include "perl.h" - #include "XSUB.h" - - /* Note: On glibc 2.13 and earlier, this needs be */ - #include - - typedef struct netconfig Netconfig; +Fairly complete examples of XS files can be found elsewhere in this +document: - MODULE = RPC PACKAGE = RPC +=over - SV * - rpcb_gettime(host="localhost") - char *host - PREINIT: - time_t timep; - CODE: - ST(0) = sv_newmortal(); - if( rpcb_gettime( host, &timep ) ) - sv_setnv( ST(0), (double)timep ); - - Netconfig * - getnetconfigent(netid="udp") - char *netid +=item * - MODULE = RPC PACKAGE = NetconfigPtr PREFIX = rpcb_ +L - void - rpcb_DESTROY(netconf) - Netconfig *netconf - CODE: - printf("NetconfigPtr::DESTROY\n"); - free( netconf ); +=item * -File C: Custom typemap for RPC.xs. (cf. L) +L - TYPEMAP - Netconfig * T_PTROBJ +=item * -File C: Perl module for the RPC extension. +L - package RPC; +=back - require Exporter; - require DynaLoader; - @ISA = qw(Exporter DynaLoader); - @EXPORT = qw(rpcb_gettime getnetconfigent); +while L contains an overview of an XS file and L +contains various worked examples. - bootstrap RPC; - 1; +You can of course look at existing XS distributions on CPAN for +inspiration, although bear in mind that many of these will have been +created before this document was rewritten in 2025, and so may not follow +current best practices. -File C: Perl test program for the RPC extension. +Note that when wrapping a real library, you'll often need to add a line +like this to the .xs file: - use RPC; + #include - $netconf = getnetconfigent(); - $a = rpcb_gettime(); - print "time = $a\n"; - print "netconf = $netconf\n"; +and add entries like: - $netconf = getnetconfigent("tcp"); - $a = rpcb_gettime("poplar"); - print "time = $a\n"; - print "netconf = $netconf\n"; + LIBS => ['-lfoo', '-lbar'], -In Makefile.PL add -ltirpc and -I/usr/include/tirpc. +to F or similar. And don't forget to add test scripts under +t/. 
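The build-side advice above (library flags in Makefile.PL) can be sketched as follows. This is a minimal, illustrative Makefile.PL rather than a definitive one: the module name RPC and the tirpc flags are borrowed from the old RPC example removed by this change, and the VERSION_FROM path is a hypothetical placeholder.

```perl
# Makefile.PL: a minimal sketch, assuming ExtUtils::MakeMaker is the
# build tool and that the wrapped library is libtirpc.
use strict;
use warnings;
use ExtUtils::MakeMaker;

WriteMakefile(
    NAME         => 'RPC',
    VERSION_FROM => 'RPC.pm',               # hypothetical path to the .pm file
    LIBS         => ['-ltirpc'],            # libraries to link against
    INC          => '-I/usr/include/tirpc', # extra header search path
);
```

As far as the ExtUtils::MakeMaker documentation describes it, each element of the LIBS array is a complete, alternative set of linker arguments, tried in order. If two libraries must both be linked, it is probably safer to put them in a single string such as '-lfoo -lbar' than in separate elements.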
=head1 CAVEATS =head2 Use of standard C library functions -See L. +Often, the Perl API contains functions which you should use I of +the standard C library ones. See L. =head2 Event loops and control flow @@ -2235,17 +4854,20 @@ This document covers features supported by C =head1 AUTHOR DIAGNOSTICS -As of version 3.49 certain warnings are disabled by default. While developing -you can set C<$ENV{AUTHOR_WARNINGS}> to true in your environment or in your -Makefile.PL, or set C<$ExtUtils::ParseXS::AUTHOR_WARNINGS> to true via code, or -pass C<< author_warnings=>1 >> into process_file() explicitly. Currently this will -enable stricter alias checking but more warnings might be added in the future. -The kind of warnings this will enable are only helpful to the author of the XS -file, and the diagnostics produced will not include installation specific -details so they are only useful to the maintainer of the XS code itself. +As of version 3.49 a few parser warnings are disabled by default. While +developing you can set C<$ENV{AUTHOR_WARNINGS}> to true in your +environment or in your Makefile.PL, or set +C<$ExtUtils::ParseXS::AUTHOR_WARNINGS> to true via code, or pass C<< +author_warnings=>1 >> into process_file() explicitly. Currently this will +enable stricter alias checking but more warnings might be added in the +future. The kind of warnings this will enable are only helpful to the +author of the XS file, and the diagnostics produced will not include +installation specific details so they are only useful to the maintainer of +the XS code itself. =head1 AUTHOR Originally written by Dean Roehrich >. +Completely rewritten in 2025. -Maintained since 1996 by The Perl Porters >. +Maintained since 1996 by The Perl Porters, >. 
diff --git a/t/porting/known_pod_issues.dat b/t/porting/known_pod_issues.dat index 8213094a3b05..995e79f873a9 100644 --- a/t/porting/known_pod_issues.dat +++ b/t/porting/known_pod_issues.dat @@ -123,6 +123,7 @@ exit(3) Expect Exporter::Easy ExtUtils::Constant::ProxySubs +ExtUtils::CppGuess fchdir(2) fchmod(2) fchown(2)