Installation scripts

On premise


Step 1) Download installation resources

wget -O conduit-2.6.0.tar.gz 'https://bpartifactstorage.blob.core.windows.net/conduit-artifacts-public/2.6.0/conduit-2.6.0.tar.gz'

Step 2) Download installation manager

wget -O conduit-transformless-install-onprem.sh 'https://bpartifactstorage.blob.core.windows.net/conduit-artifacts-public/2.6.0/conduit-transformless-install-onprem.sh'

Step 3) Follow the instructions in "Installing on VM (any cloud or on-premise services)"
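
Steps 1 and 2 can be combined into one small script. The following is a minimal sketch that uses only the two URLs listed above; the names CONDUIT_VERSION, BASE_URL, conduit_artifact_url and conduit_download are illustrative and are not part of the Conduit tooling.

```shell
#!/bin/sh
# Sketch only: helper names below are illustrative, not Conduit tooling.
CONDUIT_VERSION='2.6.0'
BASE_URL='https://bpartifactstorage.blob.core.windows.net/conduit-artifacts-public'

# Print the public artifact URL for a given file name of this release.
conduit_artifact_url() {
    printf '%s/%s/%s\n' "$BASE_URL" "$CONDUIT_VERSION" "$1"
}

# Download one artifact into the current directory under the same file name.
# Returns non-zero if wget fails.
conduit_download() {
    wget -O "$1" "$(conduit_artifact_url "$1")" || return 1
}

# Usage (uncomment to download both files; the second runs only if the first succeeds):
# conduit_download "conduit-${CONDUIT_VERSION}.tar.gz" &&
# conduit_download conduit-transformless-install-onprem.sh
```

Chaining the two downloads with && keeps the installation manager from being fetched when the resources archive could not be retrieved.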


Release summary

We are proud to announce the next release of Conduit with the following major features and improvements:


SFTP Connector

We are excited to announce a new file connector: SFTP. 

With this release, you can use Conduit to migrate, query, and virtualize datasets that sit behind SFTP servers. The SFTP connector is production ready.


Materialization Destinations

Another major feature that we are proud to introduce is Materialization Destinations.

This feature adds support for materializing (i.e. caching) data to a different destination for each connector.


Data Store

The Parquet store has been renamed to Data store.


High Availability

With this release, Conduit can run in High Availability mode. This means that you can run multiple Conduit instances behind a load balancer to provide system resilience.

For more information about Conduit deployment modes, see Conduit Deployment Diagrams.


Conduit can run in High Availability mode in the following environments:

  • on premise
  • Google Cloud
  • Azure

Please contact us for support regarding installing and running Conduit in High Availability mode.


AWS RDS Connectors

To make cloud connectors easier to use without complex provider-specific configuration, each supported database gets a dedicated connector type for the corresponding cloud provider.

This release introduces connectors for the following AWS RDS databases:

  • AWS mssql
  • AWS mysql
  • AWS oracle
  • AWS postgres
  • AWS mariaDB
  • AWS Aurora MySQL
  • AWS Aurora Postgres


Complete release notes


SFTP Connector

[CONDUITV3-1470] SFTP connector (#2595)



Data store

Data store was previously known as Parquet Store.

[CONDUITV3-XYZ] TriggeredBy in materialization audit is always null

[CONDUITV3-1845] Restart materialization in data store caused loop reset (#2577)

[CONDUITV3-1880] Change endpoint path from "/parquet-cache" to "/parquet-store" (#2597)

[CONDUITV3-1877] Entry in parquet store does not change status to 'Not Started' when virtual dataset is removed, if "materialize now" was enabled

[CONDUITV3-785] Query cache fails to expire when "Query Caching" is unchecked if table names have capital letters
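
Client scripts that still call the old endpoint path may need updating after CONDUITV3-1880. The following is a minimal sketch, assuming the path appears literally in the URLs those scripts use; the function name migrate_endpoint_path and the host conduit.example.com are hypothetical.

```shell
#!/bin/sh
# Sketch only: migrate_endpoint_path and conduit.example.com are hypothetical.
# Rewrites the first "/parquet-cache" in a URL to "/parquet-store".
migrate_endpoint_path() {
    printf '%s\n' "$1" | sed 's#/parquet-cache#/parquet-store#'
}

# Example: prints the URL with the renamed path segment.
migrate_endpoint_path 'https://conduit.example.com/parquet-cache'
```

URLs that do not contain the old path pass through unchanged, so the helper can be applied to a whole list of endpoints.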


Materialization Destinations

Each Destination also has a configurable option that selects which data storage format to use when materializing the dataset.


The supported storage formats are:

  • parquet
  • delta table

In most cases, a Destination will contain data from one dataset. A Destination is intended to be used by other systems as input, so it is highly unlikely that multiple data formats will be stored in the same folder. In practice, each Destination is logically independent of other Destinations with regard to data content.



Destination types:

  • Azure Blob 
  • Azure Government 
  • AWS S3
  • Google Cloud
  • File System


A Materialization Destination is configurable per connector if Connector Materialization is enabled.

This Destination is used by default for all dataset tables selected at the previous step.

In addition, as an advanced configuration, the user can override the Destination setting for individual tables at the Advanced step of the Connector configuration wizard.
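
The precedence described above (a per-table override wins, otherwise the connector-level Destination applies) can be sketched as follows. This illustrates the rule only and is not Conduit's implementation; the function resolve_destination and the destination names are hypothetical.

```shell
#!/bin/sh
# Sketch only: resolve_destination and the destination names are hypothetical.
# $1 = connector-level Destination, $2 = per-table override (may be empty).
resolve_destination() {
    connector_default=$1
    table_override=$2
    if [ -n "$table_override" ]; then
        # A table-level setting from the Advanced step takes precedence.
        printf '%s\n' "$table_override"
    else
        # No override: fall back to the connector's Destination.
        printf '%s\n' "$connector_default"
    fi
}

resolve_destination "azure-blob-main" ""            # prints azure-blob-main
resolve_destination "azure-blob-main" "s3-archive"  # prints s3-archive
```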


High Availability


Service Management 

[CONDUITV3-1185] - [UI] Individual VM nodes API changes and cluster view with disabled restart/check status functionality (#2236)

[CONDUITV3-843] - Conduit service High Availability - Clustered mode (#2177)

[CONDUITV3-1177] - Service Management error displayed at first login on a clean server (#2257)

[CONDUITV3-1050] - Implement the monitoring of VMs and services as a cluster from Conduit (#2237)

[CONDUITV3-1524] - [ServiceManagement] Conduit Master - display spark and hdfs namenode status when available (#2393)

[CONDUITV3-1524] - [ServiceManagement] Conduit Master - display spark and hdfs namenode status when available (#2386)

[CONDUITV3-1050] Fix api for monitoring a specific service from the cluster (#2281)

[CONDUITV3-1525] - [ServiceManagement] Conduit Worker - display spark worker and hdfs datanode status when available (#2404)

[CONDUITV3-1503] - [UI] display which Conduit VM is the leader in the serviceManagement view 


Arrow Flight Server

[CONDUITV3-1097] - Implement running queries through Arrow FlightServer for each connector when Spark queries encountered on Slave nodes (#2262)


Installation

[CONDUITV3-1261] Update the default value of conduit_fs_type in agent-service for existing and new production deployments

[CONDUITV3-1496] Create Azure resources then automatically start installing Conduit (#2412)


Data Catalog

[CONDUITV3-1490] - [HA] Data catalog assets are processed by all the nodes in an HA environment (#2402)


Error Handling

[CONDUITV3-1512] - [UI] improve 502 error message when Conduit is in HA mode (#2464)


Miscellaneous

[CONDUITV3-xyz] Rename DB_NAME to CONNECTOR_NAME in Parquet API Swagger

[CONDUITV3-1886] "Connection pool shutdown" error on s3 configuration

[CONDUITV3-1851] fixes "Failed to update query audit for query" error during query execution

[CONDUITV3-1878] Duration of caching reported in logs differs from the one in Data Store status

[CONDUITV3-1264] - Separate Aurora connectors in UI for mysql and postgres libraries (#2607)

[CONDUITV3-1940] - Display cancel dialog for metadata query (explore query) after 5 seconds (#2606)

[CONDUITV3-1914] Support materialization destination per connector (#2600)

