Text::VCardFast

NAME

Text::VCardFast - Perl extension for very fast parsing of VCards

SYNOPSIS

  use Text::VCardFast;
  my $hash = Text::VCardFast::vcard2hash($card, multival => ['adr', 'org', 'n']);
  my $text = Text::VCardFast::hash2vcard($hash, "\r\n");

DESCRIPTION

Text::VCardFast is designed to parse VCards very quickly compared to pure-perl solutions. It has a perl and an XS version of the same API, accessible as vcard2hash_pp and vcard2hash_c, with the XS version being preferred.

Why would you care? We were writing the calendaring code for fastmail.fm, and it was taking over 6 seconds to respond to a request for calendar data. The bulk of that time was going to the perl middleware layer, and THAT profiled down to the vcard parser.

Two of us independently wrote better pure perl implementations, leading to about a 5 times speedup in each case. I figured it was worth checking if XS would be much better. Here’s the benchmark on the v4 example from Wikipedia:

    Benchmark: timing 10000 iterations of fastxs, pureperl, vcardasdata...
        fastxs:  0 wallclock secs ( 0.16 usr +  0.01 sys =  0.17 CPU) @ 58823.53/s (n=10000)
                (warning: too few iterations for a reliable count)
      pureperl:  1 wallclock secs ( 1.04 usr +  0.00 sys =  1.04 CPU) @ 9615.38/s (n=10000)
    vcardasdata:  8 wallclock secs ( 7.35 usr +  0.00 sys =  7.35 CPU) @ 1360.54/s (n=10000)

(see bench.pl in the source tarball for the code)
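
A comparison along these lines can be sketched with the core Benchmark module. This is not the author's bench.pl: the sample card is made up, and calling the two implementations by fully qualified name is an assumption based on the function names given in the description.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);
use Text::VCardFast;

# Hypothetical sample card; the published numbers used the v4 example
# from Wikipedia, which is not reproduced here.
my $card = join("\r\n",
    'BEGIN:VCARD',
    'VERSION:4.0',
    'FN:Example Person',
    'EMAIL;TYPE=work:person@example.com',
    'END:VCARD',
    '',
);

# Assumption: both implementations live in the Text::VCardFast package
# and accept the same arguments as vcard2hash.
timethese(10_000, {
    fastxs   => sub { Text::VCardFast::vcard2hash_c($card) },
    pureperl => sub { Text::VCardFast::vcard2hash_pp($card) },
});
```

Absolute timings depend on the machine and on the size of the card, so only the ratio between the two entries is meaningful.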

EXPORT

  vcard2hash
  hash2vcard

API
Text::VCardFast::vcard2hash($card, %options);

  Options:
  * only_one - A flag which, if true, means parsing will stop after
    extracting a single VCard from the buffer.  This is very useful
    in cases where, for example, a disclaimer has been added after
    a calendar event in an email.
  * multival - A list of entry names which will be considered to have
    multiple values.  Instead of having a 'value' field in the hash,
    entries with this key will have a 'values' field containing an
    arrayref of values, even if there is only one value.
    The value is split on semicolon, with escaped semicolons decoded
    correctly within each item.
    Default is the empty list.
  * multiparam - As with multival, but for parameters: a list of
    parameter names which can have multiple values.  To see the
    difference here you must consider something like this:
    EMAIL;TYPE="INTERNET,HOME";TYPE=PREF:example@example.com
    If 'multiparam' includes 'TYPE' then the result will be:
    ['INTERNET', 'HOME', 'PREF'], otherwise it will be:
    ['INTERNET,HOME', 'PREF'].
    Default is the empty list.
  * barekeys - if set, then a bare parameter will be considered to be
    a parameter name with an undefined value, rather than being a
    value for the TYPE parameter.
    Consider:
    EMAIL;INTERNET;HOME:example@example.com
    barekeys off:
    {
      name => 'email',
      params => { type => ['INTERNET', 'HOME'] },
      value => 'example@example.com',
    }
    barekeys on:
    {
      name => 'email',
      params => { internet => [undef], home => [undef] },
      value => 'example@example.com',
    }
    Default is barekeys off.

  The input is a scalar containing VFILE text, as per RFC 6350 or the various
  earlier RFCs it replaces.  If the perl unicode flag is set on the scalar,
  then it will be propagated to the output values.

  The output is a hash reference containing a single key 'objects', which is
  an array of all the cards within the source text.

  Each object can have the following keys:
  * type - the text after BEGIN: and END: of the card (lower cased)
  * properties - a hash from name to array of instances within the card.
  * objects - an array of sub cards within the card.

  Each property instance is a hash with the following keys:
  * group - optional - if the property name is 'foo.bar', this will be foo.
  * name - a copy of the hash key that pointed to this property, so that
    this hash can be used without keeping the key around too
  * params - a hash of the parameters on the entry.  This is everything from
    the ; to the :
  * value - either a scalar (if not a multival field) or an array of values.
    This is everything after the :

  Decoding is done where possible, including RFC 6868 handling of ^.

  All names, both entry names and parameter names, are lowercased where the
  RFC says they are not case significant.  This means that all hash keys are
  lowercase within this API, as are card types.

  Values, on the other hand, are left in their original case even where the
  RFC says they are case insignificant, due to the increased complexity of
  tracking which version and which parameters are in effect.
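
The multival rule described above (split on semicolons, with escaped semicolons decoded inside each item) can be sketched in pure Perl. This is an illustration of the idea only, not the module's code: split_multival is a hypothetical helper, and the escape handling (a backslash followed by n or N is a newline, any other escaped character stands for itself) is an assumption based on RFC 6350.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Text::VCardFast: split a raw property
# value on unescaped semicolons and decode backslash escapes per item.
sub split_multival {
    my ($raw) = @_;
    my @items = ('');
    my @chars = split //, $raw;
    while (@chars) {
        my $c = shift @chars;
        if ($c eq '\\' && @chars) {
            # Assumption: \n and \N mean newline, anything else is literal.
            my $next = shift @chars;
            $items[-1] .= ($next eq 'n' || $next eq 'N') ? "\n" : $next;
        }
        elsif ($c eq ';') {
            # An unescaped semicolon starts the next item.
            push @items, '';
        }
        else {
            $items[-1] .= $c;
        }
    }
    return \@items;
}

# The escaped semicolon stays inside the first item.
my $values = split_multival('Acme\; Inc.;Sales');
print join(' | ', @$values), "\n";    # prints: Acme; Inc. | Sales
```

Walking the characters one at a time, rather than splitting with a regular expression, keeps an escaped backslash directly before a semicolon from being misread as an escaped semicolon.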

Text::VCardFast::hash2vcard($hash, $eol)

  The inverse operation (as much as possible!)

  Given a hash with an 'objects' key in it, output a scalar string containing
  the VCARD representation.  Lines are separated with the $eol string given,
  or the default "\n".  Use "\r\n" for files going to caldav/carddav servers.

  Names that the parser lowercases because they are not case significant are
  written back in UPPERCASE in the card, for maximum compatibility with
  other implementations.
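
A round trip through both functions might look like the following sketch. The card text is made up for illustration, and the bare function names rely on the default exports listed under EXPORT.

```perl
use strict;
use warnings;
use Text::VCardFast;

# Hypothetical input card.
my $card = join("\r\n",
    'BEGIN:VCARD',
    'VERSION:3.0',
    'FN:Example Person',
    'END:VCARD',
    '',
);

# Parse, then serialize again with CRLF line endings, as suggested
# above for caldav/carddav servers.
my $hash = vcard2hash($card);
my $text = hash2vcard($hash, "\r\n");
print $text;
```

The output is not guaranteed to be byte-identical to the input, since the inverse is only "as much as possible".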

EXAMPLES

  For more examples see the t/cases directory in the tarball, which contains
  some sample VCARDs and JSON dumps of the hash representation.

  BEGIN:VCARD
  KEY;PKEY=PVALUE:VALUE
  KEY2:VALUE2
  END:VCARD

  {
  'objects' => [
    {
      'type' => 'vcard',
      'properties' => {
        'key2' => [
          {
            'value' => 'VALUE2',
            'name' => 'key2'
          }
        ],
        'key' => [
          {
            'params' => {
              'pkey' => [
                'PVALUE'
              ]
            },
            'value' => 'VALUE',
            'name' => 'key'
          }
        ]
      }
    }
  ]
  }

  BEGIN:VCARD
  BEGIN:SUBCARD
  KEY:VALUE
  END:SUBCARD
  END:VCARD

  {
  'objects' => [
    {
      'objects' => [
        {
          'type' => 'subcard',
          'properties' => {
            'key' => [
              {
                'value' => 'VALUE',
                'name' => 'key'
              }
            ]
          }
        }
      ],
      'type' => 'vcard',
      'properties' => {}
    }
  ]
  }

  BEGIN:VCARD
  GROUP1.KEY:VALUE
  GROUP1.KEY2:VALUE2
  GROUP2.KEY:VALUE
  END:VCARD

  {
  'objects' => [
    {
      'type' => 'vcard',
      'properties' => {
        'key2' => [
          {
            'group' => 'group1',
            'value' => 'VALUE2',
            'name' => 'key2'
          }
        ],
        'key' => [
          {
            'group' => 'group1',
            'value' => 'VALUE',
            'name' => 'key'
          },
          {
            'group' => 'group2',
            'value' => 'VALUE',
            'name' => 'key'
          }
        ]
      }
    }
  ]
  }
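
Once parsed, a structure like the ones above can be walked with ordinary hash and array operations. The sketch below uses a hand-written copy of the grouped example rather than calling the parser, and list_properties is a hypothetical helper, not part of the module. It also recurses into sub cards held under 'objects'.

```perl
use strict;
use warnings;

# Hand-written copy of the grouped example hash.
my $hash = {
    objects => [
        {
            type       => 'vcard',
            properties => {
                key2 => [
                    { group => 'group1', value => 'VALUE2', name => 'key2' },
                ],
                key => [
                    { group => 'group1', value => 'VALUE', name => 'key' },
                    { group => 'group2', value => 'VALUE', name => 'key' },
                ],
            },
        },
    ],
};

# Hypothetical helper: return one "type: [group.]name=value" line per
# property instance, recursing into sub cards.
sub list_properties {
    my ($container) = @_;
    my @lines;
    for my $card (@{ $container->{objects} || [] }) {
        my $props = $card->{properties} || {};
        for my $name (sort keys %$props) {
            for my $prop (@{ $props->{$name} }) {
                my $prefix = defined $prop->{group} ? "$prop->{group}." : '';
                # multival entries carry 'values' instead of 'value'
                my $value = exists $prop->{values}
                    ? join(';', map { $_ // '' } @{ $prop->{values} })
                    : ($prop->{value} // '');
                push @lines, "$card->{type}: $prefix$name=$value";
            }
        }
        push @lines, list_properties($card);
    }
    return @lines;
}

print "$_\n" for list_properties($hash);
```

Because every property instance repeats its own 'name', the inner hashes can be passed around on their own without keeping the outer hash key.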

SEE ALSO

There is a similar module Text::vFile::asData on CPAN, but it is much slower and doesn’t do as much decoding.

Code is stored on github at

https://github.com/brong/Text-VCardFast/

AUTHOR

Bron Gondwana, <brong AT fastmail DOT fm>

COPYRIGHT AND LICENSE

Copyright (C) 2014 by Bron Gondwana

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.14.2 or, at your option, any later version of Perl 5 you may have available.
