Latest version: 4.3.x
Data Catalog, PostgreSQL
- How to create a PostgreSQL Connection to Lenses
- How to handle Permissions and Data Policies with PostgreSQL.
- How to search and/or View Data from PostgreSQL Table and Views.
- F.A.Q and troubleshooting guides, in case you get stuck somewhere.
Introduction
Lenses Data Catalog, can preview data present on PostgreSQL Tables/Views. Using the Data Catalog, you can assign Metadata(Description
, Tags
) to PostgreSQL, giving your users the ability to surface relevant information faster, all while keeping your data secure and compliant.
Connecting PostgreSQL
In order to connect to a PostgreSQL DB, we need to connect Lenses to the DB. A very similar experience to connecting
using tools like PgAdmin
or Datagri
. For Lenses that is achieved by creating a Connection
.
Creating a Connection
Given you have the necessary permissions, ManageConnections
, in this case, navigate to Connections
and click Add Connection
.
There you will find an option to connect among others a PostgreSQL
DB. Click it and you will be re-directed to a form. We require
several parameters to establish a connection.
Form Parameter | Description | Required | Notes |
---|---|---|---|
Name | The name of the connection | YES | String between 1-127 Characters. Only dashes are accepted . |
Tags | Metadata for your connection | NO | |
Host | The PostgreSQL DB hostname | YES | |
Port | The port for the PostgreSQL | YES | |
DB Name | The name of the DB | YES | |
Username | The user to connect with. | YES | |
Password | The password for that user | NO | |
SSL Mode | The security of the connection | YES | Enum (Disable, Allow, Prefer, Require, Verify-ca, Verify-full) |
Once a successful connection is established, it can be viewed in the Explore Page alongside the other available Data Sources. If you are having trouble connecting Lenses to PostgreSQL, please refer to our Troubleshooting Guide and/or F.A.Qs. .
Security Recommendations
In order to minimise security risks, we recommend to connect to your database through a READ ONLY user. This can be created with the following commands, by a DB User with the appropriate permissions. If your DB contains other
schemas
than the default one (i.e. public
), you might have to grant individual access for each of them.
CREATE USER lensesreadonly WITH ENCRYPTED PASSWORD 'mypassword';
GRANT SELECT ON ALL TABLES in schema public to lensesreadonly;
Make sure to refer to the Postgres error log in order to identify failing queries and debug permission issues.
Security & Governance
As with every source(Kafka, Elasticsearch until now), PostgreSQL, is subject to Lenses RBAC permissions, and policies for mathcing fields. Keep in mind that Lenses RBAC, are completely independant from your DBs permissions.
Lenses RBAC
Lenses is using a RBAC(Role Based Access Control) permissioning system, to allow for granular control across all your sources(Kafka
, Elasticseach
and PostgreSQL
) [1] by creating Groups with the appropriate permissions [2].
You can then assign either Users
or Service Accounts
to those Groups. In particular, for PostgreSQL, we provide 4 Permissions.
Form Parameter | Description |
---|---|
ShowTable | Can view the Table, but cannot query Data or Schema |
QueryTable | Can view the Table and query its Data |
ViewSchema | Can view the Table and its Schema |
UpdateMetadata | Can view the Table and its Metadata |
Data Governance
Data policies enable compliance with regulations such as GDPR, CCPA, or HIPAA. We use Data Policies
to obfuscate data retrieved from Lenses via the UI, CLI, or API without affecting how the underlying data is stored [1]. That ability is obviously extended to PostgresSQL [2], alongside Kafka
and Elasticsearch
.
When we apply the policy, Lenses will automatically obfuscate all fields that match. The Data Exploration screen, will also notify you for any Datasets that policies are applied to. More on policies .
UI
Lenses UI, can preview PostgreSQL Tables and Views, in the 2 places:
- SQL Studio, where you can query for PostgreSQL data.
- Explore, where you can search for datasets metadata and view details of a datasets.
Data Catalog
In the Data Catalog, a user is able to search for terms [1] based on Metadata(Tags, Description and Field Names) [2]. He can see the Name
of the Dataset, along with their mathcing fields, their Type
and Description
if it exists, and if they are protected by a policy [3].
Once we identify, our Dataset of choice, we can drill deeper, and navigate to the Details View. There we can see Data, Schema and Metadata information. We can specify, the number of items, we want to include in our query, view the results in Tree or Table View and see the Schema for each individual field.
SQL Studio
The Lenses SQL Studio provides a familiar query editor that allows writing Lenses SQL
Queries to retrieve results from individual Data Sources, such as Kafka, Elasticsearch and PostgreSQL. You can also Download
the results and preview them in either Tree or Table View.
Troubleshooting & F.A.Qs
If you have any question, please refer to the following list. If you still have more, we are more than welcome to answer any question at our community channel.
How can I view the status of my connection?
Currently, Lenses UI provides no visual indication of a connection health status. However, troubleshooting connection problems should be fairly straightforward and can be done via by inspecting the Lenses logs and searching for ERROR entries related to your connection. For more on Logs
Why Lenses is displaying inconsistent record counts?
In order to minimise load on the database, Lenses relies on the Postgres’ Statistics Collector to extract record counts. The freshness of such counts is affected by the volume of the data held in the db and by how routine vacuuming is configured. While Postgres provides various settings to control vacuuming globally, it is also possible to tweak this mechanism on a single table basis. For instance, one can aggressively reduce the autovacuum threshold down to zero by setting the following:
ALTER TABLE public.employee SET (autovacuum_vacuum_scale_factor = 0, autovacuum_vacuum_threshold = 0);
Finally, one can vacuum a table on a one off basis, or on a schedule (by using cron or an equivalent tool) by directly issuing a vacuum command . Please keep in mind that VACUUM creates a substantial amount of I/O traffic, which can cause poor performance for active query sessions.
What versions of PostgreSQL does Lenses support?
We offically support PostgreSQL version ranging from 9.6 to 12.5.
How many PostgreSQL connections can Lenses handle?
A Lenses on a Kubernetes cluster with 4GB of RAM can handle up to 10 PostgresSQL DBs with approximetaly 10K Table and/or Views with good responsiveness.
What PostgreSQL features are not supported currently?
We are not supporting yet, but we plan to in the upcoming release to support JOIN
, Aggregations (i.e: GROUP BY, HAVING)
and calling Native
PostgreSQL functions. If you have a particular use-case or feedback for our support on PostgreSQL, submit your request to our
productboard
Does Lenses SQL support write operations for PostgreSQL Tables?
Support for write operations (e.g. inserting/updating records, creating/altering tables, etc) is not planned for the immediate future.