Data Catalog, PostgreSQL

WHAT YOU'LL LEARN
  1. How to create a PostgreSQL Connection to Lenses
  2. How to handle Permissions and Data Policies with PostgreSQL.
  3. How to search and/or View Data from PostgreSQL Table and Views.
  4. F.A.Q and troubleshooting guides, in case you get stuck somewhere.

Introduction

Lenses Data Catalog, can preview data present on PostgreSQL Tables/Views. Using the Data Catalog, you can assign Metadata(Description, Tags) to PostgreSQL, giving your users the ability to surface relevant information faster, all while keeping your data secure and compliant.

PostgreSQL on Data Explortion

Connecting PostgreSQL

In order to connect to a PostgreSQL DB, we need to connect Lenses to the DB. A very similar experience to connecting using tools like PgAdmin or Datagri. For Lenses that is achieved by creating a Connection.

Creating a Connection

Given you have the necessary permissions, ManageConnections, in this case, navigate to Connections and click Add Connection. There you will find an option to connect among others a PostgreSQL DB. Click it and you will be re-directed to a form. We require several parameters to establish a connection.

Form ParameterDescriptionRequiredNotes
NameThe name of the connectionYESString between 1-127 Characters. Only dashes are accepted .
TagsMetadata for your connectionNO
HostThe PostgreSQL DB hostnameYES
PortThe port for the PostgreSQLYES
DB NameThe name of the DBYES
UsernameThe user to connect with.YES
PasswordThe password for that userNO
SSL ModeThe security of the connectionYESEnum (Disable, Allow, Prefer, Require, Verify-ca, Verify-full)
USEFUL TIP
Please refer to the official PostgreSQL Documentation for an in depth explanation of the supported SSL Mode values and how they might affect your connection security.

Once a successful connection is established, it can be viewed in the Explore Page alongside the other available Data Sources. If you are having trouble connecting Lenses to PostgreSQL, please refer to our Troubleshooting Guide and/or F.A.Qs. .

Security Recommendations

In order to minimise security risks, we recommend to connect to your database through a READ ONLY user. This can be created with the following commands, by a DB User with the appropriate permissions. If your DB contains other schemas than the default one (i.e. public), you might have to grant individual access for each of them.

CREATE USER lensesreadonly WITH ENCRYPTED PASSWORD 'mypassword';
GRANT SELECT ON ALL TABLES in schema public to lensesreadonly;

Make sure to refer to the Postgres error log in order to identify failing queries and debug permission issues.

Security & Governance

As with every source(Kafka, Elasticsearch until now), PostgreSQL, is subject to Lenses RBAC permissions, and policies for mathcing fields. Keep in mind that Lenses RBAC, are completely independant from your DBs permissions.

Lenses RBAC

Lenses is using a RBAC(Role Based Access Control) permissioning system, to allow for granular control across all your sources(Kafka, Elasticseach and PostgreSQL) [1] by creating Groups with the appropriate permissions [2].

PostgreSQL Group

You can then assign either Users or Service Accounts to those Groups. In particular, for PostgreSQL, we provide 4 Permissions.

Form ParameterDescription
ShowTableCan view the Table, but cannot query Data or Schema
QueryTableCan view the Table and query its Data
ViewSchemaCan view the Table and its Schema
UpdateMetadataCan view the Table and its Metadata

Data Governance

Data policies enable compliance with regulations such as GDPR, CCPA, or HIPAA. We use Data Policies to obfuscate data retrieved from Lenses via the UI, CLI, or API without affecting how the underlying data is stored [1]. That ability is obviously extended to PostgresSQL [2], alongside Kafka and Elasticsearch.

PostgreSQL Group

When we apply the policy, Lenses will automatically obfuscate all fields that match. The Data Exploration screen, will also notify you for any Datasets that policies are applied to. More on policies .

UI

Lenses UI, can preview PostgreSQL Tables and Views, in the 2 places:

  1. SQL Studio, where you can query for PostgreSQL data.
  2. Explore, where you can search for datasets metadata and view details of a datasets.

Data Catalog

USEFUL TIP
As described above, permissions are applied for all clients, thus users will only see and interact with the datasets they have access to (i.e: table/views in the case of PostgeSQL).

In the Data Catalog, a user is able to search for terms [1] based on Metadata(Tags, Description and Field Names) [2]. He can see the Name of the Dataset, along with their mathcing fields, their Type and Description if it exists, and if they are protected by a policy [3].

PostgreSQL Group

Once we identify, our Dataset of choice, we can drill deeper, and navigate to the Details View. There we can see Data, Schema and Metadata information. We can specify, the number of items, we want to include in our query, view the results in Tree or Table View and see the Schema for each individual field.

PostgreSQL Details - Data

SQL Studio

The Lenses SQL Studio provides a familiar query editor that allows writing Lenses SQL Queries to retrieve results from individual Data Sources, such as Kafka, Elasticsearch and PostgreSQL. You can also Download the results and preview them in either Tree or Table View.

IMPORTANT INFO
For PostgreSQL we support a limited subset of the SQL Queries geared towards the use case of secure data exploration. Currently this only includes simple `SELECT` queries with `WHERE` clauses and basic sorting.
PostgreSQL SQL Studio

Troubleshooting & F.A.Qs

If you have any question, please refer to the following list. If you still have more, we are more than welcome to answer any question at our community channel.

How can I view the status of my connection?

Currently, Lenses UI provides no visual indication of a connection health status. However, troubleshooting connection problems should be fairly straightforward and can be done via by inspecting the Lenses logs and searching for ERROR entries related to your connection. For more on Logs

Why Lenses is displaying inconsistent record counts?

In order to minimise load on the database, Lenses relies on the Postgres' Statistics Collector to extract record counts. The freshness of such counts is affected by the volume of the data held in the db and by how routine vacuuming is configured. While Postgres provides various settings to control vacuuming globally, it is also possible to tweak this mechanism on a single table basis. For instance, one can aggressively reduce the autovacuum threshold down to zero by setting the following:

ALTER TABLE public.employee SET (autovacuum_vacuum_scale_factor = 0, autovacuum_vacuum_threshold = 0);

Finally, one can vacuum a table on a one off basis, or on a schedule (by using cron or an equivalent tool) by directly issuing a vacuum command . Please keep in mind that VACUUM creates a substantial amount of I/O traffic, which can cause poor performance for active query sessions.

What versions of PostgreSQL does Lenses support?

We offically support PostgreSQL version ranging from 9.6 to 12.5.

How many PostgreSQL connections can Lenses handle?

A Lenses on a Kubernetes cluster with 4GB of RAM can handle up to 10 PostgresSQL DBs with approximetaly 10K Table and/or Views with good responsiveness.

What PostgreSQL features are not supported currently?

We are not supported yet but we plan to in the upcoming release to support, JOIN, Aggregations (i.e: GROUP BY, HAVING) and calling Native PostgreSQL functions. If you have a particular use-case or feedback for our support on PostgreSQL, submit your request to our productboard

Does Lenses SQL support write operations for PostgreSQL Tables?

Support for write operations (e.g. inserting/updating records, creating/altering tables, etc) is not planned for the immediate future.