Highly Optimized Protocol Buffer Serializers

Pure C++ implementations for reading and writing several common data formats based on Google protocol-buffers. Currently supports ‘rexp.proto’ for serialized R objects, ‘geobuf.proto’ for binary geojson, and ‘mvt.proto’ for vector tiles. This package uses C++ code generated by protobuf-compiler, so the serialization code is optimized at compile time. The ‘RProtoBuf’ package, on the other hand, uses the protobuf runtime library to provide a general-purpose toolkit for reading and writing arbitrary protocol-buffer data in R.

RProtoBuf vs protolite

This small package contains optimized C++ implementations for reading and writing several common data formats based on Google protocol-buffers. Currently it supports rexp.proto for serialized R objects, geobuf.proto for geojson data, and mvt.proto for reading Mapbox vector tiles.

To extend the package with additional formats, put your .proto file in the src directory. The package configure script will automatically generate the code and header file to include in your C++ bindings.

The protolite package is much faster than RProtoBuf because it binds directly to C++ code generated by the protoc compiler. RProtoBuf, on the other hand, uses the more flexible but slower reflection-based interface, which parses the descriptors at runtime. With RProtoBuf you can create new protocol buffers from a schema, read in arbitrary .proto files, manipulate fields, and generate or parse protocol buffers in the ASCII .prototext format. For more details, have a look at our paper: RProtoBuf: Efficient Cross-Language Data Serialization in R.

Serializing R objects

# Serialize and unserialize an object
library(protolite)
buf <- serialize_pb(iris)
out <- unserialize_pb(buf)
stopifnot(identical(iris, out))

# Fully compatible with RProtoBuf
buf <- RProtoBuf::serialize_pb(iris, NULL)
out <- protolite::unserialize_pb(buf)
stopifnot(identical(iris, out))

# Other way around
buf <- protolite::serialize_pb(mtcars, NULL)
out <- RProtoBuf::unserialize_pb(buf)
stopifnot(identical(mtcars, out))
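
Because serialize_pb returns a raw vector when no connection is given, the message can also be stored on disk with base R. The following is a minimal sketch, not taken from the package documentation; the example list `x` and the temporary file path are illustrative assumptions.

```r
library(protolite)

# Hypothetical example object (an assumption for illustration)
x <- list(id = 1:3, label = c("a", "b", "c"), flag = c(TRUE, FALSE, TRUE))

# With no connection, serialize_pb() returns the message as a raw vector
buf <- serialize_pb(x)

# Persist the raw bytes with base R, then read them back
path <- tempfile(fileext = ".pb")
writeBin(buf, path)
buf2 <- readBin(path, what = "raw", n = file.info(path)$size)

# Decode the bytes that were read from disk
out <- unserialize_pb(buf2)
```

Going through writeBin and readBin keeps the sketch independent of how a particular reader expects the file to be opened; any tool that understands rexp.proto should be able to consume the same bytes.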

Converting between GeoJSON and GeoBuf

Use the countries.geo.json example data:

# Example data
download.file("https://github.com/johan/world.geo.json/raw/master/countries.geo.json",
  "countries.geo.json")

# Convert geojson to geobuf
buf <- json2geobuf("countries.geo.json")
writeBin(buf, "countries.buf")

# The other way around
geobuf2json(buf) # either in memory
geobuf2json("countries.buf") # or from disk

# Read directly from geobuf
mydata <- read_geobuf("countries.buf")
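
The same conversion can be tried without downloading anything. This is a rough sketch that assumes json2geobuf also accepts a GeoJSON string held in memory (as its `json` argument suggests); the one-point FeatureCollection below is made-up example data, and `decimals = 6` simply spells out the coordinate precision used when encoding.

```r
library(protolite)

# Made-up GeoJSON with a single point feature (illustrative data only)
geojson <- paste0(
  '{"type":"FeatureCollection","features":[',
  '{"type":"Feature","properties":{"name":"example"},',
  '"geometry":{"type":"Point","coordinates":[4.895168,52.370216]}}]}'
)

# Encode to geobuf; decimals controls how many decimal places are kept
buf <- json2geobuf(geojson, decimals = 6)

# Decode back to a GeoJSON string and parse it for inspection
json_back <- geobuf2json(buf)
parsed <- jsonlite::fromJSON(json_back, simplifyVector = FALSE)
```

Parsing with simplifyVector = FALSE keeps the nested GeoJSON structure as plain lists, which makes it easy to check individual fields such as parsed$type.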

Installation

Binary packages for macOS or Windows can be installed directly from CRAN:

install.packages("protolite")

Installing from source on Linux or macOS requires Google’s Protocol Buffers library. On Debian or Ubuntu, install libprotobuf-dev and protobuf-compiler:

sudo apt-get install -y libprotobuf-dev protobuf-compiler

On Fedora we need protobuf-devel:

sudo yum install protobuf-devel

On CentOS / RHEL we install protobuf-devel via EPEL:

sudo yum install epel-release
sudo yum install protobuf-devel

On macOS, use protobuf from Homebrew:

brew install protobuf