Querying Databricks Data
Audience: Data Users
Content Summary: This page offers a tutorial on how to query data within the Databricks integration.
Prerequisites:
Query Data with Python
- Create a new workspace.
- Query the Immuta-protected data, which takes the form of database.table_name:
  - Database: The database that houses the backing tables of your Immuta data sources.
  - Table Name: The name of the table backing your Immuta data sources.
- Run your query. It should look something like this:
df = spark.sql('select * from database.table_name')
df.show()
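The database.table_name form can also be assembled programmatically. The following is a minimal sketch, not part of Immuta or Databricks: the helper names `qualified_name`, `build_query`, and `show_table` are hypothetical, and the identifier check assumes that both parts are plain names made of letters, digits, and underscores. It also assumes a SparkSession is passed in, such as the `spark` object that Databricks notebooks provide.

```python
import re

# Hypothetical helpers for illustration only; not an Immuta or Databricks API.
# Assumption: database and table names are plain identifiers
# (letters, digits, underscores, not starting with a digit).
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def qualified_name(database: str, table_name: str) -> str:
    """Return 'database.table_name' after validating both parts."""
    for part in (database, table_name):
        if not _IDENTIFIER.fullmatch(part):
            raise ValueError(f"not a plain identifier: {part!r}")
    return f"{database}.{table_name}"


def build_query(database: str, table_name: str) -> str:
    """Build the same select statement used in the step above."""
    return f"select * from {qualified_name(database, table_name)}"


def show_table(spark, database: str, table_name: str):
    """Run the query on an existing SparkSession and display the result."""
    df = spark.sql(build_query(database, table_name))
    df.show()
    return df
```

In a notebook cell, calling `show_table(spark, "database", "table_name")` would run the same query as the example above. Validating the two name parts before formatting them into the statement keeps stray characters out of the SQL text.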
Query Data with SQL
- Create a new workspace.
- Query the Immuta-protected data, which takes the form of database.table_name:
  - Database: The database that houses the backing tables of your Immuta data sources.
  - Table Name: The name of the table backing your Immuta data sources.
- Run your query. It should look something like this:
select * from database.table_name;
Query Data with SparkR
Establish the User's Identity
- Create a new workspace.
- Run:
library(SparkR)
Run a Query
- In the same workspace, but in a different cell, query the Immuta-protected data, which takes the form of database.table_name:
  - Database: The database that houses the backing tables of your Immuta data sources.
  - Table Name: The name of the table backing your Immuta data sources.
- Run your query. It should look something like this:
df <- SparkR::sql("select * from database.table_name")
SparkR::head(df)
Query Data with Scala
- Query the Immuta-protected data, which takes the form of database.table_name:
  - Database: The database that houses the backing tables of your Immuta data sources.
  - Table Name: The name of the table backing your Immuta data sources.
- Run your query. It should look something like this:
val sqlDF = spark.sql("select * from database.table_name")
sqlDF.show()