NAME

Net::Gemini - a small gemini client

SYNOPSIS

  use Net::Gemini;
  my ($gem, $code) = Net::Gemini->get('gemini://example.org/');

  use Syntax::Keyword::Match;
  match($code : ==) {
    case(0) { die "request failed " . $gem->error }
    case(1) { ... $gem->meta as prompt for input ... }
    case(2) { ... $gem->meta and collect on the socket ... }
    case(3) { ... $gem->meta as redirect ... }
    case(4) { ... $gem->meta as temporary failure ... }
    case(5) { ... $gem->meta as permanent failure ... }
    case(6) { ... $gem->meta as client certificate message ... }
  }
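Where Syntax::Keyword::Match is not available, the same dispatch might be sketched with a plain hash of code references. Only the code numbers 0 through 6 come from the synopsis; the next_step function and the strings it returns are hypothetical and not part of Net::Gemini.

```perl
use strict;
use warnings;

# Hypothetical handlers keyed on the result code. The descriptions follow
# the synopsis; the returned strings are invented for illustration.
my %on_code = (
    0 => sub { "request failed: $_[0]" },
    1 => sub { "prompt for input: $_[0]" },
    2 => sub { "collect content of type: $_[0]" },
    3 => sub { "redirect to: $_[0]" },
    4 => sub { "temporary failure: $_[0]" },
    5 => sub { "permanent failure: $_[0]" },
    6 => sub { "client certificate message: $_[0]" },
);

# $detail would be $gem->error for code 0 and $gem->meta otherwise.
sub next_step {
    my ( $code, $detail ) = @_;
    my $handler = $on_code{$code}
      or die "unexpected code '$code'\n";
    return $handler->($detail);
}

print next_step( 2, 'text/gemini' ), "\n";
```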

DESCRIPTION

This module provides code that may help implement a gemini protocol client in Perl.

CAVEATS

It's a pretty beta module.

The default SSL verification is more or less to accept the connection; this is perhaps not ideal. The caller will need to implement TOFU or a similar means of verifying the other end.

gemini://makeworld.space/gemlog/2020-07-03-tofu-rec.gmi

FUNCTION

gemini_request URI [ options ... ]

A utility function that is not exported by default; it calls the get method and handles redirects and the collection of content, if any.

  use Net::Gemini 'gemini_request';
  my ( $gem, $code ) = gemini_request( 'gemini://...', ... );

A code of 2 will result in the mime accessor being populated from the Content-Type, as parsed by the parse_mime_type function from Parse::MIME.

A notable difference here is that a code of 3 indicates that too many redirects were encountered, not that there was a redirect. This should be considered an error.

The socket will be closed when this call ends.

Options include:

bufsize => strictly-positive-integer

Buffer size to use for reads from the socket. 4096 by default.

content_callback => code-reference

Custom callback to handle the content, which may arrive in one or more portions; the interface is the same as for getmore. If a callback is not provided, the content will be collected into the object and made available via the content accessor. The callback is given the current buffer (raw), the length of that buffer, and a reference to the gemini object.

  gemini_request( $uri, content_callback => sub {
    my ( $buffer, $length, $gem ) = @_;
    ...
    return 1;
  });

Processing will stop if the callback returns a false value.
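Because a false return value stops processing, a callback can enforce its own size limit. The following is a minimal sketch of that idea; make_capped_collector is a hypothetical helper, not part of Net::Gemini, and it is exercised here with simulated buffers rather than a live request.

```perl
use strict;
use warnings;

# Returns a callback plus a reference to the collected bytes. The callback
# returns false once $limit bytes have been kept, which stops processing.
sub make_capped_collector {
    my ($limit) = @_;
    my $body = '';
    my $callback = sub {
        my ($buffer) = @_;    # the length and object arguments are not needed
        my $room = $limit - length $body;
        $body .= substr $buffer, 0, $room;
        return length($body) < $limit ? 1 : 0;
    };
    return ( $callback, \$body );
}

my ( $cb, $body_ref ) = make_capped_collector(8);
$cb->( 'abcde', 5 );       # true: room remains
$cb->( 'fghij', 5 );       # false: limit reached
print $$body_ref, "\n";    # abcdefgh
```

The returned callback would presumably be passed as content_callback to gemini_request; that call is left out because it needs a reachable server.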

max_size => strictly-positive-integer

Maximum content size to collect into content. Ignored if a custom callback is provided. If the response is larger than permitted, the content will be truncated, the code will be zero, the error will be max_size, and the status will start with 2.

max_redirects => strictly-positive-integer

How many redirections should be followed. 5 is the default.

redirect_delay => floating-point

How long to delay between redirects, by default 1 second. There is a delay by default because gemini servers or firewalls may rate limit requests, or the gemini server simply may not have much CPU available.

param => hash-reference

Parameters that will be passed to the get method.
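The special cases described above (a code of 3 meaning too many redirects, and a code of 0 with a max_size error meaning truncated content) can be folded into one small check. This is only a sketch: classify_result and its labels are made up for illustration, and it takes plain values so that it can be tried without a server. In a client the arguments would come from the returned code and from $gem->error.

```perl
use strict;
use warnings;

# Interpret the code and error left behind by gemini_request.
# The returned labels are hypothetical; pick whatever suits the client.
sub classify_result {
    my ( $code, $error ) = @_;
    $error = '' unless defined $error;
    return 'truncated'          if $code == 0 and $error eq 'max_size';
    return 'failed'             if $code == 0;
    return 'too-many-redirects' if $code == 3;
    return 'ok'                 if $code == 2;
    return 'other';    # 1, 4, 5, 6: consult the meta line
}

print classify_result( 0, 'max_size' ), "\n";    # truncated
print classify_result( 3, undef ),      "\n";    # too-many-redirects
```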

METHODS

get URI [ parameters ... ]

Tries to obtain the given gemini URI.

Returns an object and a result code. The socket is set to use the :raw binmode. The result code will be 0 if there was a problem with the request--that the URI failed to parse, or the connection failed--or otherwise a gemini code in the range of 1 to 6 inclusive, which will indicate the next steps any subsequent code should take.

For code 2 responses the response body may be split between _buf and whatever remains unread in the socket, if anything; use the getmore method or the gemini_request utility function to collect it.

Parameters include:

bufsize => strictly-positive-integer

Size of buffer to use for requests, 4096 by default. Note that a naughty server may return data in far smaller increments than this.

ssl => { params }

Passes the given parameters to the IO::Socket::SSL constructor. These could be used to configure e.g. the SSL_verify_mode or to set a verification callback, or to specify a custom SNI host via SSL_hostname.

Timeout can be used to set a connect timeout on the socket. However, a server could wedge at any point following, so it may be necessary to wrap a get request with the alarm function or similar.
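One way to wrap a call with alarm is the usual eval and local $SIG{ALRM} pattern from the Perl documentation. The with_deadline helper below is a hypothetical sketch, not part of Net::Gemini; how promptly the alarm interrupts a wedged TLS read may depend on the platform.

```perl
use strict;
use warnings;

# Run $code with an overall deadline of $seconds (a whole number).
# Returns ( 1, @results ) on success or ( 0, $error ) on timeout or die.
sub with_deadline {
    my ( $seconds, $code ) = @_;
    my @results;
    my $ok = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $seconds;
        @results = $code->();
        alarm 0;
        1;
    };
    my $error = $@;
    alarm 0;    # also clear the alarm if $code died for another reason
    return $ok ? ( 1, @results ) : ( 0, $error );
}

# Intended use (needs a server, so it is only shown as a comment):
#   my ( $ok, $gem, $code ) = with_deadline( 30, sub { Net::Gemini->get($uri) } );
my ( $ok, $value ) = with_deadline( 5, sub { 6 * 7 } );
print "$ok $value\n";    # 1 42
```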

tofu => boolean

If true, only the leaf certificate will be checked. Otherwise, the full certificate chain will be verified by default, which is probably not what you want when trusting the very first leaf certificate seen.

With this flag set, hostname verification is also turned off; the caller can manage SSL_verifycn_scheme and possibly SSL_verifycn_name via the ssl param if this needs to be customized.

verify_ssl => code-reference

Custom callback function to handle SSL verification. The default is to accept the connection (Trust On All Uses), which is perhaps not ideal. The callback is passed a hash reference containing various information about the certificate and connection.

  ...->get( $url, ..., verify_ssl => sub {
    my ($param) = @_;
    # placeholder that accepts at random; a real client would inspect $param
    return 1 if int rand 2; # certificate is OK
    return 0;
  } );

Note that some have argued that under TOFU one should verify neither the hostname nor the dates (notBefore, notAfter) of the certificate, and should only accept the first certificate presented as-is, like SSH does, and use that certificate thereafter. This has plusses and minuses.

See bin/gmitool for how verify_ssl might be used in a client.

In module version 0.08 the format of the digest (fingerprint) changed to be compatible with the amfora gemini client.
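A trust-on-first-use check could be built around such a callback roughly as follows. This is a sketch under loud assumptions: the key names host, port, and digest in the hash reference are guesses that should be checked against the module source or bin/gmitool, make_tofu_verifier is a hypothetical helper, and the known fingerprints are kept in memory where a real client would persist them. The callback is exercised with made-up digests.

```perl
use strict;
use warnings;

# Build a trust-on-first-use verifier around a hash of known fingerprints.
# ASSUMPTION: the hash reference given to the callback has "host", "port"
# and "digest" keys; verify these names before relying on them.
sub make_tofu_verifier {
    my ($known) = @_;    # hash reference: "host:port" => digest
    return sub {
        my ($param) = @_;
        my $peer = join ':', $param->{host}, $param->{port};
        if ( !exists $known->{$peer} ) {
            $known->{$peer} = $param->{digest};    # first use: remember it
            return 1;
        }
        return $known->{$peer} eq $param->{digest} ? 1 : 0;
    };
}

my %known;
my $verify = make_tofu_verifier( \%known );

# Simulated callback invocations with invented digests.
print $verify->( { host => 'example.org', port => 1965, digest => 'AA' } ), "\n";    # 1
print $verify->( { host => 'example.org', port => 1965, digest => 'BB' } ), "\n";    # 0
```

Such a verifier would presumably be passed as verify_ssl, likely together with a true tofu parameter.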

getmore callback [ bufsize => n ]

A callback interface is provided to consume the response body, if any. Generally a body should only be present for response code 2. The meta line should be consulted for details on the MIME type and encoding of the bytes; $body in the following code may need to be decoded.

  my $body = '';
  $gem->getmore(
      sub {
          my ( $buffer, $length ) = @_;
          $body .= $buffer;
          return 1;
      }
  );

The bufsize parameter is as for get.
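Decoding the collected bytes might look like the sketch below, which uses only the core Encode module. The decode_body function is hypothetical, and its regular expression is a simplification rather than a full MIME parser. The UTF-8 default for text types without a charset parameter follows the gemini specification.

```perl
use strict;
use warnings;
use Encode ();

# Decode raw response bytes according to the charset parameter of a meta
# line such as "text/gemini; charset=utf-8". Returns nothing for types
# that are not text.
sub decode_body {
    my ( $meta, $bytes ) = @_;
    return unless $meta =~ m{^\s*text/}i;
    my $charset = $meta =~ /;\s*charset\s*=\s*"?([^";\s]+)/i ? $1 : 'UTF-8';
    return Encode::decode( $charset, $bytes );
}

my $text = decode_body( 'text/gemini; charset=utf-8', "caf\xC3\xA9" );
print length($text), "\n";    # 4 characters decoded from 5 bytes
```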

ACCESSORS

The following accessor methods are available. Alternatively, you can use the faster hash internals, which are not expected to change.

code

Code of the request, 0 to 6 inclusive. Pretty important, so it is also returned by get. 0 is an extension to the specification, and is used for connection errors (e.g. host not found) and other problems outside the gemini protocol.

content

The content, if any, as raw bytes. Only set if the code is 2 and a suitable gemini_request call has been made.

error

The error message, if any.

host

Host used for the request.

ip

IP address used for the request.

meta

Gemini meta line. Use varies depending on the code.

mime

Only set by gemini_request for 2 code responses; contains an array reference of return values from the parse_mime_type function of Parse::MIME.

port

Port used for the request.

socket

Socket to the server. May not be of much use after getmore has finished, or after gemini_request.

status

Status of the request, a two digit number. Only set when the code is a gemini response (that is, not an internal 0 code).

uri

URI used for the request. Could probably be used to resolve any relative URL returned by the server.

BUGS

None known. But it is a somewhat incomplete module, and the specification may change, too.

SEE ALSO

gemini://gemini.circumlunar.space/docs/specification.gmi (v0.16.1)

gemini://gemini.thebackupbox.net/test/torture

RFC 2045

COPYRIGHT AND LICENSE

Copyright 2022 Jeremy Mates

This program is distributed under the (Revised) BSD License: https://opensource.org/licenses/BSD-3-Clause