Data Integration Process

Background

Loops is a no-code platform that works on top of your existing data warehouse.

It runs the analyses in your environment and does not store user-level data (unless explicitly requested).

We currently support the following:

    • Google BigQuery 
    • AWS Athena
    • AWS Redshift
    • Vertica
    • Snowflake
    • Databricks SQL
    • PostgreSQL
    • ClickHouse
    • Mixpanel, Amplitude, Google Analytics/Firebase
    • Heap, Pendo, PostHog, Segment, Braze (depending on access and tier)

    In case your company uses a different warehouse, please let us know.

    A few general notes to take into consideration: 

    • Loops runs queries on your data warehouse
    • You can share the original tables or views with us, excluding sensitive information and PII
    • If there are limitations on connecting from a specific network, you can either open access to our production IP addresses or grant us access through a VPN (depending on the VPN technology and permissions).

    A typical integration process takes several days and includes the following steps:

      1. Getting access to your data warehouse/analytics solution. 
      2. 2-3 setup sessions to align on your definitions, KPIs, etc. These sessions are led by Loops' experienced data analysts and are designed to ensure that the opportunities align with the metrics you want to move.
      3. Sharing access to Loops’ platform.

    Below is the integration process for each platform: 

    BigQuery

    We would need access to your data project with the following permissions; please provide a service account to integrations@getloops.ai (a connection sketch follows the list):

    • BigQuery.Editor
    • “bigquery.readsessions.create” – To be used with BigQuery Storage Read API for faster data retrieval
    • “bigquery.tables.list” – Used in the discovery process and to monitor schema changes
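
    For illustration only, here is a minimal sketch of how a service account with these permissions could be used, assuming the google-cloud-bigquery and google-cloud-bigquery-storage client libraries (plus pandas); the key file, dataset, and query are placeholders, not values from this document.

        from google.cloud import bigquery
        from google.oauth2 import service_account

        # Hypothetical service-account key file shared with integrations@getloops.ai
        creds = service_account.Credentials.from_service_account_file("loops-service-account.json")
        client = bigquery.Client(credentials=creds, project=creds.project_id)

        # Table listing relies on bigquery.tables.list (discovery and schema monitoring)
        for table in client.list_tables("your_dataset"):  # placeholder dataset
            print(table.table_id)

        # Query results pulled through the BigQuery Storage Read API
        # (relies on bigquery.readsessions.create; requires pandas)
        rows = client.query("SELECT user_id, event_name FROM your_dataset.events LIMIT 10").result()
        df = rows.to_dataframe(create_bqstorage_client=True)
        print(df.head())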

    Snowflake

    If your warehouse is not publicly available (VPC), you can whitelist our production environment IP. 

    Please share with us a connection string for your Snowflake cluster and a relevant warehouse. 
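
    For illustration, a minimal connection sketch using the snowflake-connector-python package; the account identifier, credentials, and warehouse/database names are placeholders for the details you share with us.

        import snowflake.connector

        # Placeholder connection details -- replace with the values shared with Loops
        conn = snowflake.connector.connect(
            account="your_account_identifier",
            user="LOOPS_USER",
            password="********",
            warehouse="ANALYTICS_WH",
            database="ANALYTICS",
            schema="PUBLIC",
        )

        cur = conn.cursor()
        cur.execute("SELECT CURRENT_VERSION()")  # simple connectivity check
        print(cur.fetchone())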

    Redshift

    Access must be granted to a Loops user account with the relevant access to the tables and views in your Redshift cluster.

    If your warehouse is not publicly available (VPC), you can whitelist our production environment IP. 
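
    For illustration, a minimal connection sketch using the redshift_connector package; the cluster endpoint, database, and credentials are placeholders.

        import redshift_connector

        # Placeholder details for the Loops user account on your cluster
        conn = redshift_connector.connect(
            host="your-cluster.abc123.us-east-1.redshift.amazonaws.com",
            port=5439,  # default Redshift port
            database="analytics",
            user="loops_user",
            password="********",
        )

        cur = conn.cursor()
        cur.execute("SELECT COUNT(*) FROM events")  # placeholder table
        print(cur.fetchone())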

    ClickHouse

    Please provide us with the following connection parameters (a connection sketch follows the list):

    • Host – the hostname of the running ClickHouse server.
    • Port – the port the ClickHouse server is bound to. Defaults to 9000 if the connection is not secured and to 9440 if it is secured.
    • Database – the database name.
    • User – the username used for the connection.
    • Password – the user's password.
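
    For illustration, a minimal sketch using the clickhouse-driver package over the native protocol; every connection value below is a placeholder.

        from clickhouse_driver import Client

        # Placeholder parameters matching the list above
        client = Client(
            host="clickhouse.example.com",
            port=9440,      # 9000 for unsecured connections, 9440 for secured
            secure=True,
            database="analytics",
            user="loops_user",
            password="********",
        )

        print(client.execute("SELECT version()"))  # simple connectivity check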

    Athena

    Please provide us with the relevant AWS credentials.
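
    For illustration, a minimal query sketch assuming the pyathena package and an S3 staging location for query results; the region, bucket, table, and credentials are placeholders (credentials can also come from the environment or an IAM role).

        from pyathena import connect

        # Placeholder values for the access you provide
        conn = connect(
            s3_staging_dir="s3://your-athena-results-bucket/loops/",
            region_name="us-east-1",
            aws_access_key_id="AKIA................",
            aws_secret_access_key="********",
        )

        cur = conn.cursor()
        cur.execute("SELECT COUNT(*) FROM analytics.events")  # placeholder database.table
        print(cur.fetchone())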

    Mixpanel/Amplitude

    • Because the ability to query Mixpanel and Amplitude directly is limited at large data volumes, we usually recommend moving the data to a dedicated data warehouse. The process is simple, as both companies offer structured integrations with data warehouses. 
    • Loops can support the ETL process of moving the data to your warehouse. Reach out for more details. 

    Mixpanel

    Currently, Mixpanel offers two solutions for exporting the raw events outside Mixpanel (and gaining control over your data):

      • Raw Export API – an API endpoint to download all events between two dates (see the sketch after the table below)

      • Raw export pipelines – a scheduled export of all events managed by Mixpanel directly to destination buckets (AWS S3 or Google’s GCS).

    | | Raw Export API | Raw Export Pipelines | Export directly to DW |
    | Managed | Not managed – you need to build a pipeline yourself | Partially managed – you need to load the data from a bucket to your DW [1] | Fully managed |
    | Data ownership | You own your data | You own your data | You get view access to a managed DW |
    | Pricing | Free; Loops can manage this process for you [4] | Available in Growth and Enterprise plans – requires the Data Pipelines add-on [2] [4] | Available in Growth and Enterprise plans – requires the Data Pipelines add-on [2] |
    | Pull frequency (from Mixpanel to the data warehouse) | You control it; some limitations exist [3] | Hourly / Daily | Hourly / Daily |

    [1] Loops can manage this step if you choose GCP and BigQuery

    [2] 30-day free trial exists

    [3] https://help.mixpanel.com/hc/en-us/articles/115004602563-Rate-Limits-for-API-Endpoints

    [4] Cloud data warehouse solutions incur costs
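
    For illustration, a minimal sketch of pulling raw events through the Raw Export API with the requests package; the endpoint and parameters follow Mixpanel's documented export API, and the project secret and date range are placeholders.

        import requests

        # Placeholder project API secret and date range
        resp = requests.get(
            "https://data.mixpanel.com/api/2.0/export",
            params={"from_date": "2024-01-01", "to_date": "2024-01-31"},
            auth=("YOUR_PROJECT_API_SECRET", ""),
            timeout=300,
        )
        resp.raise_for_status()

        # The response is newline-delimited JSON, one raw event per line
        events = [line for line in resp.text.splitlines() if line]
        print(len(events), "events exported")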

    Amplitude

    | | Raw Export API | Amplitude Query | Amplitude ETL |
    | Managed | Not managed – you need to build a pipeline yourself; Loops can build that for you [1] | A managed Snowflake data warehouse offered by Amplitude. Basically, we'll query the data warehouse behind Amplitude. | Partially managed – you need to load the data from a bucket to your DW [1] |
    | Data ownership | You own your data [2] | You get access to a managed DW | You own the data [2] |
    | Pricing | Included in all plans | Not included in the plans; additional costs incurred | Included in all paid plans |
    | Pull frequency (from Amplitude to the data warehouse) | You control it | Hourly / Daily | Hourly / Daily |

    [1] Loops can manage this step if you choose GCP and BigQuery

    [2] Cloud data warehouse solutions incur costs 
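
    For illustration, a minimal sketch of the Raw Export API route using the requests package; the endpoint and hour-granular start/end format follow Amplitude's documented Export API, and the API key, secret key, and dates are placeholders.

        import io
        import zipfile

        import requests

        # Placeholder project credentials and hour-granular date range (YYYYMMDDTHH)
        resp = requests.get(
            "https://amplitude.com/api/2/export",
            params={"start": "20240101T00", "end": "20240101T23"},
            auth=("YOUR_API_KEY", "YOUR_SECRET_KEY"),
            timeout=300,
        )
        resp.raise_for_status()

        # The response is a zip archive of gzipped, newline-delimited JSON event files
        archive = zipfile.ZipFile(io.BytesIO(resp.content))
        print(archive.namelist())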

    Google Analytics/Firebase

    Firebase/GA lets you move your analytics data to BigQuery with the click of a button. You don't need to be a paying Firebase customer to do so. 

    When you link your project to BigQuery, Firebase/GA exports a copy of your existing data to BigQuery. Firebase/GA also sets up a daily sync of your data from your GA/Firebase project to BigQuery. 

    Loops can set up a BigQuery account for you and move the Firebase/GA data to BigQuery. The steps:

    1. Set up a Google Cloud account – see instructions here (a 1-minute process). Make sure to include billing details in the account. 
    2. Provide integrations@getloops.ai with access to your Firebase/GA account. If you prefer to move the data yourself, please follow the export instructions. A query sketch against the exported tables follows the list.
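
    For illustration, once the export is live the linked dataset can be queried like any other BigQuery data; the sketch below assumes the standard GA4/Firebase export layout (a dataset named analytics_<property_id> with daily events_YYYYMMDD tables), and the project and property ID are placeholders.

        from google.cloud import bigquery

        client = bigquery.Client(project="your-gcp-project")  # placeholder project

        # Daily event counts from the GA4/Firebase export (placeholder property ID)
        query = """
            SELECT event_date, event_name, COUNT(*) AS events
            FROM `your-gcp-project.analytics_123456789.events_*`
            WHERE _TABLE_SUFFIX BETWEEN '20240101' AND '20240107'
            GROUP BY event_date, event_name
            ORDER BY event_date, events DESC
        """
        for row in client.query(query).result():
            print(row.event_date, row.event_name, row.events)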

    Databricks

    We support two authentication methods – username/password (if SSO is disabled on your server) or a Databricks API token. A connection sketch follows the details and links below. 

    Details needed:

    • Token or username/password
    • Domain name of your Databricks deployment
    • Workspace ID
    • Cluster ID

    Databricks’ manual for generating API Token – https://docs.databricks.com/dev-tools/api/latest/authentication.html#token-management

    Databricks’ manual for getting needed information – https://docs.databricks.com/workspace/workspace-details.html 
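
    For illustration, a token-based connection sketch using the databricks-sql-connector package; the hostname, HTTP path (which for an all-purpose cluster typically embeds the workspace ID and cluster ID), and token are placeholders, and a username/password connection would look slightly different.

        from databricks import sql

        # Placeholder deployment details -- the HTTP path usually looks like
        # sql/protocolv1/o/<workspace-id>/<cluster-id> for an all-purpose cluster
        conn = sql.connect(
            server_hostname="your-deployment.cloud.databricks.com",
            http_path="sql/protocolv1/o/1234567890123456/0123-456789-abcde123",
            access_token="dapiXXXXXXXXXXXXXXXXXXXXXXXX",
        )

        cur = conn.cursor()
        cur.execute("SELECT 1")  # simple connectivity check
        print(cur.fetchone())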

    Segment

    You can use Segment to send the data to the data warehouse. There are two options to do so:

    1. Send the data to Loops' data warehouse (which is based on BigQuery). See details about the process.   
    2. Send the data to your data warehouse. Segment supports all the major data warehouses.
      1. Loops can set up the data warehouse for you if you do not have one yet.

    Pendo

    • Pendo supports integration with all the major cloud data warehouses, such as BigQuery, Redshift, and Snowflake. This integration is available as a paid add-on for all Pendo subscriptions. 
    • Loops can set up the integration for you; reach out for details. 

    Heap

    • Heap offers Heap Connect, a managed ETL process that transfers the data to your data warehouse with a few clicks. Heap Connect is a paid add-on that is free for "Premier" users. 
    • Loops can set up the integration for you; reach out for details. 

    Postgres

    • Access must be granted to a Loops user account with the relevant access to the tables and views (a connection sketch follows this list). 
    • Postgres resources are limited, so Loops' analyses may be constrained for companies with a massive user base. 
    • Loops does not run on your operational DB, as it might affect performance – only on the analytics DB. 
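
    For illustration, a minimal connection sketch with psycopg2 against the analytics DB; the host, database, and credentials are placeholders.

        import psycopg2

        # Placeholder details for the Loops user on the analytics DB (not the operational DB)
        conn = psycopg2.connect(
            host="analytics-db.example.com",
            port=5432,
            dbname="analytics",
            user="loops_user",
            password="********",
        )

        with conn.cursor() as cur:
            cur.execute("SELECT COUNT(*) FROM events")  # placeholder table
            print(cur.fetchone())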

    Vertica

    We would need access from our GCP (Google Cloud) environment to your Vertica deployment. The integration depends on Vertica's implementation (on-prem or cloud). 

    Braze

    Getting events from Braze requires Braze Currents; please get in touch with your Braze sales rep to understand whether your current plan allows for Braze Currents. Once available, there are three options: Snowflake, AWS S3 storage, and Google Cloud Storage (GCS). Loops can move the data from Braze to your data warehouse: BigQuery over GCP, Athena over S3, or Snowflake. 

    Make sure you can accurately join Braze data with product usage data by sending the product's user ID to Braze (see the sketch below). 
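
    For illustration, one way to make the join possible is to set the product's user ID as the Braze external_id when tracking users via Braze's REST /users/track endpoint; the REST endpoint host (instance-specific), API key, and custom attribute below are placeholders.

        import requests

        # Placeholder Braze REST endpoint and API key
        BRAZE_REST_URL = "https://rest.iad-01.braze.com/users/track"
        BRAZE_API_KEY = "YOUR_BRAZE_REST_API_KEY"

        payload = {
            "attributes": [
                {
                    # Use the same user ID your product analytics uses so Braze events
                    # can be joined with product usage data in the warehouse
                    "external_id": "product-user-12345",
                    "plan": "pro",  # placeholder custom attribute
                }
            ]
        }

        resp = requests.post(
            BRAZE_REST_URL,
            json=payload,
            headers={"Authorization": f"Bearer {BRAZE_API_KEY}"},
            timeout=30,
        )
        resp.raise_for_status()
        print(resp.json())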

    Splunk

    Usually, we recommend transferring the data to a cloud data warehouse. Reach out for more details. 

    PostHog

    PostHog integrates with all the major cloud warehouses. If you host PostHog in your own environment, the integration depends on the PostHog database you're using. Reach out for more details. 

    For any questions or additional support, please contact us.