nav[aria-label="Primary Navigation"] { padding: 0; & ul { list-style: none; width: 100%; display: flex; flex-direction: row; justify-content: start; align-items: start; gap: 30px; padding: 0; & li { margin: 0; } & ul li { list-style: none; } } }

Integrating Board with Google BigQuery

Center of Excellence

1. Abstract

This document describes the recommended approach for integrating Google BigQuery with Board using the Progress DataDirect JDBC connector within the Data Pipeline. It outlines the integration architecture, configuration steps, and data import methods, positioning BigQuery as the source of truth and Board as the analytics and planning layer. Key considerations include authentication via service accounts, performance optimization, security governance, and best practices for efficient data loading. The integration enables organizations to combine BigQuery’s scalability with Board’s modeling and reporting capabilities to create a robust, cloud-based analytics environment.

2. Context

As with many other integrations in Board, BigQuery is a widely used data warehouse platform that is lately increasingly being integrated with Board. While the integration process using the direct connector is relatively straightforward, it does present certain challenges.

For those approaching their first integration, it may raise several questions, so where should you begin your discussions with the customer when integrating with BigQuery?

A recommended integration approach is to use the Google BigQuery connection via the Progress DataDirect JDBC connector, which runs within the Data Pipeline. Please note that this is not a native Board connector, but a third-party connector provided by Progress.

The configuration process starts in the Data Pipeline, where the connection is set up. Next, you define the connection alias in the Cloud Admin portal. Once these steps are completed, you can create the Data Source in the Board System Administration portal.

Using the Board BigQuery Connector, data can be imported efficiently into our Board data models, allowing users to analyse, plan, and report on large volumes of cloud-based data within a governed environment.

3. Content

3.1 Architecture Flow

The typical integration flow looks like this:

BigQuery hosts datasets and tables in Google Cloud.
Service account credentials are used by the Datapipeline JDBC driver to authenticate to BigQuery
The Datapipeline executes queries against BigQuery. Datapipeline is the query execution layer.
Data is loaded into Board entities and cubes via the SQL data readers
Board users analyse and consume the data through Capsules, Screens, and Presentations.

This approach keeps BigQuery as the source of truth while Board serves as the analytics and planning layer.

3.2 Configuring the Connection

Before setting up the integration, ensure that the following prerequisites are in place: a Google Cloud project with BigQuery enabled, the appropriate permissions assigned, and access to the required datasets or tables to be queried.

3.2.1. A Service Account in Google Cloud needs to be created – This first step, in particular, must be completed by the customer’s administrator. Below are the guidance steps along with the mandatory files required for Board configuration:

Generate a service account in the Google Cloud Console
Assign the required BigQuery roles.
Download the JSON key file, which will be used in Board for authentication.

3.2.2. Set Up the Data Source in Board Data pipeline

In Board Data pipeline, create a new Data Source and select Google BigQuery as the connector type.
Choose your authentication method (most common one is a Service Account)
Upload the service account JSON key
Specify the Project ID, Dataset and Location (if applicable)
Test the connection to validate access

These fields are mandatory to establish connectivity. Once the connection has been validated and the data integration performance assessed, additional advanced settings can be configured to optimize performance further.

3.3 Importing Data into Board

Once the connection is established, data can be imported using Data Readers. You will be able to browse the configured metadata and view the available Projects and Datasets. It is therefore important to confirm with the customer which tables contain the data to be imported into Board.

BigQuery Terminology Mapping

In Google BigQuery, the terminology differs slightly from traditional relational databases:

A Project is equivalent to a database.
A Dataset corresponds to a schema.
A Table remains the same.

Understanding this mapping helps avoid confusion when browsing metadata in the Data Pipeline and when writing SQL queries, especially since fully qualified schema references may not always be required or supported by the JDBC driver.

Supported Import Methods

Table-based imports: Directly read from BigQuery tables.
SQL-based imports: Use custom SQL queries for transformations, joins, or filtering.

Mapping to Board Structures

BigQuery fields are mapped to:
- Entities (dimensions)
- Cubes (measures)

Data types should be carefully reviewed to ensure compatibility and optimal performance.

3.4 Considerations and Challenges

Performance

When integrating with BigQuery, consider the following best practices:

Use SQL queries to pre-aggregate data where possible
Limit the number of rows returned during imports
Avoid SELECT *; explicitly define required fields
Schedule imports during off-peak hours to reduce load
Partition and cluster BigQuery tables to improve query performance

Security and Governance

Authentication is handled securely via Google service accounts
Board respects role-based security once data is loaded
Sensitive data can be filtered or masked during the import process
Access to BigQuery remains controlled by Google Cloud IAM policies

Challenges

Large datasets leading to long load times
Data type mismatches between BigQuery and Board
SQL complexity affecting performance
Cost implications from frequent BigQuery queries

Best Practices

Request to have views into BigQuery which match the Board Data Model
Optimize SQL and reduce data volume early
Design Board data models aligned with BigQuery schemas
Use incremental loads where possible
Monitor query costs in Google Cloud

4. Conclusion

The Progress DataDirect BigQuery JDBC connector provides a powerful and scalable way to integrate cloud-native data with enterprise planning and analytics. By combining BigQuery’s performance and scalability with Board’s modeling, analysis, and presentation capabilities, organizations can create a robust, end-to-end analytics ecosystem.

Find more posts tagged with

Integrations

Data Integration

Comments

Center of Excellence

Thanks to @Mirela Simion for the insights on this subject.