Data Catalog, Elasticsearch

WHAT YOU'LL LEARN
  1. How to create an Elasticsearch Connection in Lenses.
  2. How to handle Permissions and Data Policies with Elasticsearch.
  3. How to search and view data from Elasticsearch Indices.
  4. F.A.Q. and troubleshooting guides, in case you get stuck.

Introduction

The Lenses Data Catalog can preview data stored in Elasticsearch Indices. Using the Data Catalog, you can assign Metadata (Description, Tags) to Elasticsearch Indices, giving your users the ability to surface relevant information faster, all while keeping your data secure and compliant.

Connecting Elasticsearch

To work with Elasticsearch data, Lenses needs to connect to one or more Elasticsearch instances. In Lenses, this is achieved by creating a Connection.

Creating a Connection

Provided you have the necessary permission (ManageConnections, in this case), navigate to Connections and click Add Connection. There you will find, among others, an option to connect an Elasticsearch instance. Click it and you will be redirected to a form, which takes the following parameters to establish a connection.

Form Parameter   Description                             Required   Notes
Name             The name of the connection              YES        String between 1-127 characters
Tags             Metadata for your connection            NO
Username         The user to connect with                NO
Password         The password for that user              NO
Nodes            The Elasticsearch nodes to connect to   YES        Array of strings
USEFUL TIP
The connection allows the client to specify multiple hosts from a cluster. This means that if one node goes down, Lenses can still connect to another.
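The benefit of listing several nodes can be illustrated with a minimal Python sketch. It is not how Lenses is implemented; it only shows the general idea of trying each configured node in order until one responds. The function names `http_probe` and `first_reachable` are hypothetical, and the default probe assumes the node answers an unauthenticated GET on its root endpoint (a secured cluster would reply with 401, which this probe treats as unreachable).

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, Optional


def http_probe(node_url: str, timeout: float = 3.0) -> bool:
    """Return True if the node answers HTTP 200 on its root endpoint."""
    try:
        with urllib.request.urlopen(node_url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


def first_reachable(nodes: Iterable[str],
                    probe: Callable[[str], bool] = http_probe) -> Optional[str]:
    """Try each node in the given order and return the first that responds."""
    for node in nodes:
        if probe(node):
            return node
    return None  # No node in the list could be reached.
```

Passing the probe as a parameter keeps the failover logic separate from the network call, so a probe that sends credentials can be swapped in for secured clusters.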

Once a connection is successfully established, it can be viewed in the Explore page alongside the other available Data Sources. If you are having trouble connecting Lenses to Elasticsearch, please refer to our Troubleshooting Guide and F.A.Qs.

Security Recommendations

To minimise security risks, we recommend connecting to your Elasticsearch instance as a read-only user. A read-only role can be created with the following command, run by a user with the appropriate permissions.

curl -XPOST -u elastic '<URL>/_security/role/read_only' -H "Content-Type: application/json" -d '{
  "indices" : [
    {
      "names" : [ "*" ],
      "privileges" : [ "read" ]
    }
  ]
}'
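A role on its own does not give Lenses anything to log in with, so a user holding the `read_only` role is usually created next. The sketch below builds (but does not send) such a request against the Elasticsearch create-user endpoint, `POST /_security/user/<username>`. The helper name `build_create_user_request` and the user name `lenses_reader` are hypothetical, and the sketch assumes basic authentication with an administrative account.

```python
import base64
import json
import urllib.request


def build_create_user_request(base_url, admin_user, admin_password,
                              new_user, new_password, role="read_only"):
    """Build a POST request that creates a user holding the given role."""
    body = json.dumps({"password": new_password, "roles": [role]}).encode("utf-8")
    credentials = f"{admin_user}:{admin_password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/_security/user/{new_user}",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


# Hypothetical values; replace them with your own URL and credentials.
request = build_create_user_request(
    "http://localhost:9200", "elastic", "admin-secret",
    "lenses_reader", "reader-secret",
)
```

Sending the request with `urllib.request.urlopen(request)` would create the user; its credentials can then be entered in the Username and Password fields of the Lenses connection form.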

Security & Governance

As with every other source (currently Kafka and PostgreSQL), Elasticsearch is subject to Lenses RBAC permissions and to data policies for matching fields. Keep in mind that Lenses RBAC is completely independent of your Elasticsearch instance’s permissions.

Lenses RBAC

Lenses uses an RBAC (Role-Based Access Control) permission system that allows granular control across all your sources (Kafka, Elasticsearch and PostgreSQL) [1] by creating Groups with the appropriate permissions [2].

Elasticsearch Group

You can then assign either Users or Service Accounts to those Groups. Specifically, for Elasticsearch, we provide 4 Permissions.

Permission       Description
ShowIndex        Can view the Index, but cannot query Data or Schema
QueryIndex       Can view the Index and query its Data
ViewSchema       Can view the Index and its Schema
UpdateMetadata   Can view the Index and its Metadata
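To make the combined effect of these permissions easier to reason about, here is a small Python model of the table above. It is purely illustrative: the `Capability` names and the `capabilities` helper are hypothetical and are not part of any Lenses API; only the four permission names come from the table.

```python
from enum import Enum
from typing import Iterable, Set


class Capability(Enum):
    VIEW_INDEX = "view index"
    QUERY_DATA = "query data"
    VIEW_SCHEMA = "view schema"
    VIEW_METADATA = "view metadata"


# What each permission grants, as described in the table above.
GRANTS = {
    "ShowIndex": {Capability.VIEW_INDEX},
    "QueryIndex": {Capability.VIEW_INDEX, Capability.QUERY_DATA},
    "ViewSchema": {Capability.VIEW_INDEX, Capability.VIEW_SCHEMA},
    "UpdateMetadata": {Capability.VIEW_INDEX, Capability.VIEW_METADATA},
}


def capabilities(permissions: Iterable[str]) -> Set[Capability]:
    """Return the union of capabilities granted by a set of permissions."""
    granted: Set[Capability] = set()
    for permission in permissions:
        granted |= GRANTS[permission]
    return granted
```

For example, a group holding only ShowIndex and ViewSchema can see an Index and its Schema, but `Capability.QUERY_DATA` is absent from the result, so its members cannot query the data.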

Data Governance

Data policies enable compliance with regulations such as GDPR, CCPA, or HIPAA. We use Data Policies to obfuscate data retrieved from Lenses via the UI, CLI, or API without affecting how the underlying data is stored [1]. This ability extends to Elasticsearch, alongside Kafka and PostgreSQL.

Elasticsearch Policy

When a policy is applied, Lenses automatically obfuscates all matching fields. The Explore screen also notifies you of any Datasets that have policies applied to them. More on policies.
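The idea of obfuscating matching fields on read can be sketched in a few lines of Python. This is not the Lenses implementation: the fixed `MASK` value, the exact-name matching rule, and the `obfuscate` function are all assumptions made for illustration. The point it demonstrates is that the masking happens on a copy of the retrieved record, leaving the stored document untouched.

```python
from typing import Any, Set

MASK = "****"  # Hypothetical replacement value.


def obfuscate(record: Any, protected_fields: Set[str]) -> Any:
    """Return a copy of the record with every protected field masked."""
    if isinstance(record, dict):
        return {
            key: MASK if key in protected_fields
            else obfuscate(value, protected_fields)
            for key, value in record.items()
        }
    if isinstance(record, list):
        return [obfuscate(item, protected_fields) for item in record]
    return record  # Scalars are returned unchanged.


document = {"user": {"email": "jane@example.com", "id": 7}, "amount": 10}
masked = obfuscate(document, {"email"})
```

Here `masked` carries the placeholder in place of the email address at any nesting depth, while `document` keeps its original values.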

UI

The Lenses UI can preview Elasticsearch Indices in two places:

  1. SQL Studio, where you can query for Elasticsearch data.
  2. Explore, where you can search Dataset metadata and view the details of a Dataset.

Data Catalog

USEFUL TIP
As described above, permissions are applied to all clients, so users will only see and interact with the datasets they have access to (i.e. Indices, in the case of Elasticsearch).

In the Data Catalog, a user can search for terms [1] based on Metadata (Tags, Description and Field Names) [2]. They can see the Name of each Dataset, along with its matching fields, their Type and Description (if one exists), and whether they are protected by a policy [3].
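A metadata search of this kind can be pictured with the following Python sketch. It is a simplified stand-in, not the Lenses search engine: the `Dataset` class, the case-insensitive substring matching, and the `search` function are assumptions chosen only to show which pieces of metadata take part in the match.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class Dataset:
    name: str
    description: str = ""
    tags: List[str] = field(default_factory=list)
    field_names: List[str] = field(default_factory=list)


def search(datasets: Iterable[Dataset], term: str) -> List[Dataset]:
    """Return datasets whose tags, description or field names contain the term."""
    needle = term.lower()
    hits = []
    for dataset in datasets:
        searchable = [dataset.description, *dataset.tags, *dataset.field_names]
        if any(needle in text.lower() for text in searchable):
            hits.append(dataset)
    return hits


catalog = [
    Dataset("orders", "Customer orders", ["sales"], ["order_id", "email"]),
    Dataset("metrics", "Service metrics", ["ops"], ["cpu", "memory"]),
]
```

With this catalog, `search(catalog, "email")` returns only the `orders` dataset, because the term matches one of its field names.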

Elasticsearch search on Explore

Once we identify our Dataset of choice, we can drill deeper and navigate to the Details View. There we can see Data [1], Schema, Shards and Metadata information [2, 3]. We can specify the number of items to include in our query, view the results in Tree or Table View, and see the Schema for each individual field.

Elasticsearch Details - Data

SQL Studio

The Lenses SQL Studio provides a familiar query editor that allows writing Lenses SQL Queries to retrieve results from individual Data Sources, such as Kafka, Elasticsearch and PostgreSQL. You can also Download the results and preview them in either Tree or Table View.

Elasticsearch SQL Studio

Troubleshooting & F.A.Qs

If you have any questions, please refer to the following list. If you still have more, we are happy to answer them on our community channel.

How can I view the status of my connection?

Currently, the Lenses UI provides a visual indication of the connection health status, but only in the Explore screen. However, uncovering connection problems should be fairly easy: you can inspect the Lenses logs and search for ERROR entries related to your connection. Learn more
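As a rough illustration of that log search, the Python sketch below keeps only the lines that mention both ERROR and a given connection name. The log layout is an assumption, since the exact format of Lenses log entries is not described here, and the function name `connection_errors` is hypothetical.

```python
from typing import Iterable, List


def connection_errors(log_lines: Iterable[str], connection_name: str) -> List[str]:
    """Return the log lines containing both 'ERROR' and the connection name."""
    return [
        line.rstrip("\n")
        for line in log_lines
        if "ERROR" in line and connection_name in line
    ]
```

Because the function accepts any iterable of strings, an open log file can be passed to it directly, for example `connection_errors(open(path), "my-es")`, where `path` points at your Lenses log file and `my-es` is the name you gave the connection.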

What versions of Elasticsearch does Lenses support?

We officially support Elasticsearch versions 6 and 7, including all minor and patch releases of each.
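If you are unsure which version a cluster runs, the root endpoint of Elasticsearch (`GET /`) reports it under `version.number`. The following Python sketch checks such a version string against the supported range stated above; the function name `is_supported` is hypothetical.

```python
def is_supported(version_number: str) -> bool:
    """Return True if the major version is 6 or 7."""
    major = int(version_number.split(".")[0])
    return 6 <= major <= 7
```

For instance, `is_supported("7.10.2")` is true, while `is_supported("8.1.0")` is false.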

How many Elasticsearch connections can Lenses handle?

A Lenses instance running on a Kubernetes cluster with 4GB of RAM can handle up to 10 Elasticsearch connections with approximately 10K Indices while remaining responsive.

Why is my Elasticsearch Node ‘Yellow’, even when it’s healthy?

Note that in a single-node cluster, even a healthy index will always have a Yellow status. That doesn’t mean something is wrong; it only means that your replica rules are not satisfied. The status will change to green once your cluster contains enough nodes to allocate the replicas, even if Lenses is connected to just one node.
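The same reasoning can be applied to the response of the Elasticsearch cluster health API (`GET /_cluster/health`), which includes the fields `status` and `number_of_nodes`. The Python sketch below turns such a response into a short explanation; the function name `explain_health` and the wording of the messages are illustrative only.

```python
def explain_health(health: dict) -> str:
    """Explain the 'status' field of a cluster health response."""
    status = health.get("status")
    if status == "green":
        return "All primary and replica shards are allocated."
    if status == "yellow":
        if health.get("number_of_nodes") == 1:
            # A replica is never placed on the same node as its primary.
            return ("Primaries are allocated; replicas cannot be allocated "
                    "on a single-node cluster.")
        return "Primaries are allocated; some replicas are not allocated yet."
    if status == "red":
        return "At least one primary shard is not allocated."
    return "Unknown status."
```

Calling it with `{"status": "yellow", "number_of_nodes": 1}` yields the single-node explanation, which matches the situation described above.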

What Elasticsearch features are not supported currently?

The main limitation of Lenses SQL for Elasticsearch is its ability to convert SQL statements into the query language expected by the Elasticsearch REST query interface. If you have a particular use case, or feedback on our Elasticsearch support, submit your request to our productboard.

Does Lenses SQL support write operations for Elasticsearch Indices?

Support for write operations (e.g. inserting/updating records, creating/altering indices, etc) is not planned for the immediate future.