Overview

Apache Impala is a massively parallel processing (MPP) SQL query engine for data stored in Apache Hadoop clusters. Key benefits include:

  • Low-latency queries: Interactive response times on large datasets, often within seconds
  • Native Hadoop integration: Direct access to data in HDFS and Apache HBase
  • ANSI SQL: Standard SQL syntax familiar to analysts and developers
  • High concurrency: Handles multiple simultaneous queries efficiently
  • Compatibility: Works with the Hive Metastore for shared schema definitions

Impala is ideal for interactive analytics, business intelligence, and ad-hoc querying on Hadoop data without the latency of batch processing.

Connecting

To connect to Apache Impala in DBCode:

  1. Open the DBCode extension in Visual Studio Code and select Add Connection.
  2. Choose Apache Impala from the database type list.
  3. Configure the connection:
    • Host: The Impala daemon (impalad) hostname or IP address
    • Port: Default is 21050 for the HiveServer2 interface
    • Database: The default database to connect to (usually “default”)
  4. Configure authentication (if required):
    • None: No authentication (default for many Impala deployments)
    • LDAP: LDAP-based authentication for secured clusters
  5. Save the connection to start querying your Impala databases.
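
If a saved connection fails to open, it can help to first confirm that the host and port from step 3 are reachable from your machine. The following is a minimal sketch using only the Python standard library. It is not part of DBCode, and the function name `impalad_reachable` is a hypothetical helper chosen for illustration. It only checks that a TCP connection can be opened and says nothing about authentication.

```python
import socket


def impalad_reachable(host: str, port: int = 21050, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper: 21050 is the HiveServer2 port mentioned above.
    """
    try:
        # create_connection resolves the hostname and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

For example, `impalad_reachable("impalad.example.com")` (a placeholder hostname) returns `False` when the daemon is down, the port is blocked by a firewall, or the name does not resolve.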

DBCode Features for Impala

With an Impala connection, DBCode provides:

  • Database Browser: Explore databases, tables, and views
  • Query Editor: Write and execute SQL queries with syntax highlighting
  • Data Grid: View query results with sorting and filtering
  • Table Metadata: View column definitions and table statistics
  • DDL Generation: Generate CREATE TABLE statements

Supported Object Types

  • Databases: Browse and switch between Impala databases
  • Tables: Internal and external tables including Kudu tables
  • Views: SQL views with underlying query definitions

Authentication Options

No Authentication (NoSasl)

This is the default for many Impala deployments. DBCode connects without SASL authentication.

LDAP Authentication

For secured clusters, authenticate using LDAP credentials.
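
For comparison outside DBCode, the two options above roughly correspond to the `auth_mechanism` values `NOSASL` and `LDAP` in the third-party impyla Python client. The sketch below builds keyword arguments for `impala.dbapi.connect` and runs a single statement. It illustrates the settings only and is not how DBCode connects internally. The helper names `impala_connect_kwargs` and `run_query` are hypothetical, and impyla must be installed separately.

```python
from typing import Optional


def impala_connect_kwargs(
    host: str,
    port: int = 21050,
    database: str = "default",
    user: Optional[str] = None,
    password: Optional[str] = None,
) -> dict:
    """Build keyword arguments for impyla's connect() (hypothetical helper)."""
    kwargs = {"host": host, "port": port, "database": database}
    if user is None:
        # No credentials supplied: connect without SASL.
        kwargs["auth_mechanism"] = "NOSASL"
    else:
        if not password:
            raise ValueError("LDAP authentication requires a password")
        kwargs.update(auth_mechanism="LDAP", user=user, password=password)
    return kwargs


def run_query(sql: str, **kwargs):
    """Run one statement and return all rows. Requires `pip install impyla`."""
    # Imported lazily so that impala_connect_kwargs has no third-party dependency.
    from impala.dbapi import connect

    conn = connect(**kwargs)
    try:
        cur = conn.cursor()
        cur.execute(sql)
        return cur.fetchall()
    finally:
        conn.close()
```

With a reachable cluster, a call such as `run_query("SHOW DATABASES", **impala_connect_kwargs("impalad.example.com"))` would list the databases visible to an unauthenticated session. The hostname here is a placeholder.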

Learn more about Apache Impala at impala.apache.org.