Specto DataLake (Source)
Read computed Specto DataLake outputs into IMSURGE.
The Specto DataLake (Source) integration reads the active computed parquet output from a selected Specto DataLake project, calculation, and pipeline. IMSURGE imports the selected computed dataset as a single-device time-series source for use in your pipelines.
Setup
Prerequisites
Before setting up this integration, obtain the SDL License Key provided by Specto upon account setup.
A valid, unexpired SDL License Key is required for IMSURGE to discover available projects, calculations, and pipelines during setup.
Credential Setup
Use Specto DataLake Credentials to create or select the credential for this integration. The same credential works for both Specto DataLake source and target setups.
Integration Setup
After selecting the Specto DataLake credential, configure the integration in this order:
- Project – Select the Specto DataLake project that IMSURGE reads from.
- Calculation – This field appears after Project is selected. Choose the calculation folder inside the selected project.
- Pipeline – This field appears after Calculation is selected. Options are loaded from non-archive computed parquet outputs for the selected project and calculation.
If you change Project during setup, Calculation and Pipeline are cleared. If you change Calculation, Pipeline is cleared. After the integration is first saved, Project, Calculation, and Pipeline cannot be changed.
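The dependent-field behavior described above can be sketched as a small state model. This is only an illustration: the `SourceSelection` class and its method names are hypothetical and are not part of any IMSURGE or Specto API. It also assumes that re-selecting the same value does not clear the dependent fields.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SourceSelection:
    """Hypothetical model of the three setup fields (not an IMSURGE API)."""

    project: Optional[str] = None
    calculation: Optional[str] = None
    pipeline: Optional[str] = None
    saved: bool = False

    def _ensure_editable(self) -> None:
        # After the first save, none of the three fields can be changed.
        if self.saved:
            raise ValueError("selection is locked after the first save")

    def set_project(self, project: str) -> None:
        self._ensure_editable()
        if project != self.project:
            # Changing Project clears both dependent fields.
            self.project = project
            self.calculation = None
            self.pipeline = None

    def set_calculation(self, calculation: str) -> None:
        self._ensure_editable()
        if self.project is None:
            raise ValueError("select a Project first")
        if calculation != self.calculation:
            # Changing Calculation clears only Pipeline.
            self.calculation = calculation
            self.pipeline = None

    def set_pipeline(self, pipeline: str) -> None:
        self._ensure_editable()
        if self.calculation is None:
            raise ValueError("select a Calculation first")
        self.pipeline = pipeline

    def save(self) -> None:
        if None in (self.project, self.calculation, self.pipeline):
            raise ValueError("Project, Calculation, and Pipeline are all required")
        self.saved = True
```

In this sketch, calling `set_project` with a new value after a pipeline has been chosen leaves `calculation` and `pipeline` as `None`, and any setter raises once `save()` has succeeded.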
Reference
For credential fields, see Specto DataLake Credentials.
Limitations
- Computed output only – IMSURGE reads only the active computed parquet for the selected Project, Calculation, and Pipeline. Raw data, /pipeline/parquet, and archive data are not read. There is no archive or include-archive option for this integration.
- Full-file reads – Each run downloads the full active computed parquet and applies the IMSURGE sync window after download rather than during the object read.
- Timestamped parquet required – The selected parquet must include a timestamp column. IMSURGE treats all other columns as metrics, normalizes timestamps to UTC, drops invalid timestamps, keeps the latest row for duplicate timestamps, and may warn or fail if timestamped data is missing.
- Returned data shape – The integration returns a single device that uses the selected pipeline name as the device identifier. Unit, latitude, and longitude metadata are not added.
- License validity – Invalid or expired SDL licenses prevent setup discovery and stop execution before any data is read.
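The data-handling limitations above (full-file reads, timestamp handling, and the single-device shape) can be illustrated with a minimal sketch. It is not IMSURGE's implementation. Rows are plain dictionaries standing in for an already-downloaded parquet file, and the names `prepare_rows`, `as_single_device`, `timestamp_key`, `device_id`, and `rows` are hypothetical. The sketch further assumes that naive timestamps are already UTC, that "latest row" means the last row in file order, that the sync window is half-open, and that missing timestamped data raises an error (the real integration may only warn).

```python
from datetime import datetime, timezone
from typing import Any, Dict, Iterable, List, Optional

Row = Dict[str, Any]


def to_utc(value: Any) -> Optional[datetime]:
    """Return an aware UTC datetime, or None if the value is not a valid timestamp."""
    if isinstance(value, datetime):
        parsed = value
    elif isinstance(value, str):
        try:
            parsed = datetime.fromisoformat(value)
        except ValueError:
            return None
    else:
        return None
    if parsed.tzinfo is None:
        # Assumption: naive timestamps are treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def prepare_rows(
    rows: Iterable[Row],
    window_start: datetime,
    window_end: datetime,
    timestamp_key: str = "timestamp",
) -> List[Row]:
    """Normalize, de-duplicate, and window rows from a fully downloaded file."""
    latest: Dict[datetime, Row] = {}
    for row in rows:
        ts = to_utc(row.get(timestamp_key))
        if ts is None:
            continue  # drop rows with invalid timestamps
        # Every column other than the timestamp is treated as a metric.
        metrics = {k: v for k, v in row.items() if k != timestamp_key}
        latest[ts] = metrics  # a later row replaces an earlier duplicate
    if not latest:
        # Assumption: fail here; the real integration may warn instead.
        raise ValueError("no timestamped data found")
    # The sync window is applied only after the whole file has been processed.
    return [
        {"timestamp": ts, **latest[ts]}
        for ts in sorted(latest)
        if window_start <= ts < window_end
    ]


def as_single_device(pipeline_name: str, rows: List[Row]) -> Dict[str, Any]:
    """Wrap rows as one device keyed by the pipeline name; no unit or location metadata."""
    return {"device_id": pipeline_name, "rows": rows}
```

Because the window filter runs last, the cost of a run in this sketch scales with the size of the whole file rather than with the size of the sync window, which mirrors the full-file read limitation.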