Google Cloud 2018
Google Cloud
A completely uncurated list of every Google Cloud blog post from 2018.
Beam BigQuery annotation
Dataflow, Apache Beam, BigQuery
Interacting with BigQuery from Apache Beam/Dataflow requires using the TableRow class from the Google API client libraries to read and write data field by field, as well as managing table schemas when writing data. The BigQuery Utilities library from Windfall Data provides a workflow to streamline this process. Perhaps the most valuable benefit of this library is that it reduces the risk of annoying and potentially time-consuming mistakes when managing BigQuery schema definitions.
The @BigQueryColumn annotation is used to define the table schema. The library handles schema modifications introduced when the annotation usage changes. It also provides a transform to read or write data, avoiding the boilerplate associated with TableRow. Collections are mapped to repeated columns.
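To make the idea concrete, here is a minimal, self-contained sketch of annotation-driven schema mapping in plain Java. It is not the BigQuery Utilities API: the `Column` annotation, the `Trade` class, and the `schemaOf` and `toRow` helpers are all hypothetical stand-ins, and a `Map` takes the place of TableRow so the example runs without Beam or the Google API client on the classpath. It only illustrates the general technique, in which annotated fields drive both the schema and the row contents, and collection-typed fields become repeated columns.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch only: none of these names come from the BigQuery Utilities library.
class AnnotationSchemaSketch {

    // Stand-in for an annotation such as @BigQueryColumn; the real one may take other attributes.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Column {
        String type(); // BigQuery type name, e.g. "STRING" or "INTEGER"
    }

    // Example row class (an assumption for illustration).
    static class Trade {
        @Column(type = "STRING") String symbol;
        @Column(type = "INTEGER") long quantity;
        @Column(type = "STRING") List<String> tags; // collection, so mapped to a repeated column
        String internalNote;                        // not annotated, so not part of the schema

        Trade(String symbol, long quantity, List<String> tags) {
            this.symbol = symbol;
            this.quantity = quantity;
            this.tags = tags;
        }
    }

    // Derives "name:TYPE:MODE" column descriptions from the annotated fields, sorted by name.
    static List<String> schemaOf(Class<?> cls) {
        List<String> columns = new ArrayList<>();
        for (Field field : cls.getDeclaredFields()) {
            Column column = field.getAnnotation(Column.class);
            if (column == null) {
                continue; // skip fields that are not annotated
            }
            boolean repeated = Collection.class.isAssignableFrom(field.getType());
            columns.add(field.getName() + ":" + column.type() + ":" + (repeated ? "REPEATED" : "NULLABLE"));
        }
        Collections.sort(columns); // reflection does not guarantee field order
        return columns;
    }

    // Copies annotated field values into a map that plays the role of a TableRow.
    static Map<String, Object> toRow(Object value) {
        Map<String, Object> row = new TreeMap<>();
        for (Field field : value.getClass().getDeclaredFields()) {
            if (field.getAnnotation(Column.class) == null) {
                continue;
            }
            try {
                field.setAccessible(true);
                row.put(field.getName(), field.get(value));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Cannot read field " + field.getName(), e);
            }
        }
        return row;
    }

    public static void main(String[] args) {
        System.out.println(schemaOf(Trade.class));
        System.out.println(toRow(new Trade("GOOG", 10L, Arrays.asList("tech", "us"))));
    }
}
```

Because the schema and the row are both derived from the same annotated fields, adding or removing a field changes them together, which is the kind of schema-definition mistake the post says the library helps avoid.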
See the GitHub repo for more.