Product · DataFlow Platform

The integration
engine
your data deserves.

DataFlow is an enterprise-grade data integration platform. Design complex pipelines visually, execute them with sub-millisecond precision, and monitor everything in real time.

Three subsystems.
One platform.

DataFlow is built around a high-performance execution engine, a visual flow designer, and a comprehensive monitoring system — all wired together through a unified API gateway.

Engine

Flow Engine

DAG-based pipeline executor with wave-parallel processing, conditional branching, sub-flows, data lineage tracking, and cron scheduling. Built on the Tokio async runtime for maximum throughput.

Designer

Visual Designer

Drag-and-drop React canvas powered by React Flow. Auto-generated config forms from module JSON Schemas. Version history, diff viewer, one-click rollback, and credential management.
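The schema-to-form mapping can be sketched as follows. This is a simplified Python illustration only; the real designer is a React application, and the field-descriptor shape here is an assumption, not the platform's actual format:

```python
def schema_to_fields(schema: dict) -> list[dict]:
    """Derive form-field descriptors from a module's JSON Schema (sketch)."""
    widgets = {"string": "text", "boolean": "checkbox",
               "integer": "number", "number": "number"}
    required = set(schema.get("required", []))
    return [
        {
            "name": name,
            "label": prop.get("title", name),
            # An enum becomes a dropdown; otherwise pick a widget by type.
            "widget": "select" if "enum" in prop else widgets.get(prop.get("type"), "text"),
            "options": prop.get("enum"),
            "required": name in required,
        }
        for name, prop in schema.get("properties", {}).items()
    ]

fields = schema_to_fields({
    "type": "object",
    "properties": {
        "host": {"type": "string", "title": "Host"},
        "mode": {"type": "string", "enum": ["active", "passive"]},
    },
    "required": ["host"],
})
print(fields[0]["widget"], fields[0]["required"])  # text True
print(fields[1]["widget"])                         # select
```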

Observability

Monitoring Dashboard

Real-time execution waterfall, structured log viewer, system health indicators, and cancel actions. OpenTelemetry traces, Prometheus metrics, and pre-built Grafana dashboards.

API

API Gateway

Full REST API with RBAC enforcement, OIDC/SAML/local auth, JWT middleware, and an MCP server for AI assistant integration. All endpoints are typed, audited, and tested.

Security

Auth & Licensing

OIDC, SAML 2.0, and local auth with 16-permission RBAC. Ed25519-signed offline licenses — no call-home required, works in air-gapped networks.

AI

AI & LLM Integration

First-class Claude AI Assistant module for data analysis, document generation, and natural language flow design. Plus an AI Agent Gateway supporting OpenAI, Gemini, AWS Bedrock, Azure OpenAI, and self-hosted models — all with unified prompt templating, structured output, and agentic tool-use within flows.

49 modules.
Across 14 categories.

Every module implements the same FlowModule trait — chainable, configurable, and testable. The ecosystem spans file transfer, cloud storage, data processing, B2B/EDI, ERP/CRM, e-commerce, security, AI/LLM, and more. Build custom modules in Python via the module SDK, or distribute and install them through the module marketplace.
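A custom module written against the Python SDK might look roughly like this. The class name, method names, and the StepResult shape are illustrative assumptions, not the SDK's actual API; they mirror the contract the FlowModule trait implies, namely a declared config schema plus an execute step:

```python
from dataclasses import dataclass

@dataclass
class StepResult:
    """Hypothetical result type returned by a module step."""
    status: str
    output: dict

class UppercaseModule:
    """Toy module: uppercases one string field from the previous step's output."""

    type_id = "custom-uppercase"  # hypothetical type ID

    def config_schema(self) -> dict:
        # JSON Schema used to auto-generate the designer's config form
        return {
            "type": "object",
            "properties": {"field": {"type": "string"}},
            "required": ["field"],
        }

    def execute(self, config: dict, inputs: dict) -> StepResult:
        value = inputs[config["field"]]
        return StepResult(status="success", output={config["field"]: value.upper()})

# Chainable and testable in isolation:
mod = UppercaseModule()
result = mod.execute({"field": "name"}, {"name": "acme"})
print(result.output)  # {'name': 'ACME'}
```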

File Transfer & Transport

  • SFTP Transfer (sftp-transfer) — Upload/download via SFTP with key/password auth, glob patterns, post-download actions
  • AS2 Transfer (as2-transfer) — Send/receive files via AS2 (EDI over HTTP) with MDN acknowledgment support
  • FTP Transfer (ftp-transfer) — Upload/download via FTP/FTPS (explicit and implicit TLS), active/passive modes, glob patterns
  • AS4 Transfer (as4-transfer) — ebMS 3.0 / AS4 for EU B2B (Peppol, e-SENS), WS-Security signing/encryption, receipt handling
  • OFTP Transfer (oftp-transfer) — OFTP2 (RFC 5024) for automotive/manufacturing supply chains, TLS, EERP/NERP receipts, restart/recovery

Cloud Storage

  • AWS S3 (cloud-s3) — Upload/download/list/delete S3 objects, multipart uploads, server-side encryption, presigned URLs, SQS/SNS triggers
  • Google Cloud Storage (cloud-gcs) — Upload/download/list/delete GCS objects, IAM or HMAC key auth, lifecycle policies, Pub/Sub event trigger
  • Azure Blob Storage (cloud-azure-blob) — Upload/download/list/delete blobs, Shared Key or SAS token auth, hot/cool/archive tiers, Event Grid trigger
  • Cloudflare R2 (cloud-r2) — S3-compatible API for Cloudflare R2 with zero egress fees, Workers integration

Data Processing & Transformation

  • File Transform (file-transform) — CSV, JSON, XML, and fixed-width format conversions
  • File Router (file-router) — Route data to different branches based on content rules or metadata
  • File Compression (file-compress) — ZIP, GZIP, and TAR compression and decompression
  • CSV File (csv-file) — Parse, process rows, and generate CSV with per-row filtering and routing
  • XML Processor (xml-process) — Parse, transform (XSLT), validate (XSD), XPath extraction, generate. Streaming for large documents
  • JSON Processor (json-process) — Parse, transform (JSONPath/JMESPath), validate (JSON Schema), merge, split, flatten/unflatten
  • Document Processor (doc-process) — Read/write DOCX (template fill, mail merge), XLSX (formulas, named ranges), PDF (text extraction, form fill)

Security & Encryption

  • PGP Encrypt/Decrypt (pgp-crypto) — Encrypt, decrypt, sign, verify with rPGP 0.14. ASCII armor and binary output
  • AES Encrypt/Decrypt (aes-crypto) — AES-128/192/256 in CBC, CTR, GCM, CCM modes. PKCS7 padding, configurable IV, Base64 or binary output
  • 3DES Encrypt/Decrypt (3des-crypto) — Triple DES in CBC and ECB modes for legacy system interoperability (banking, POS, FIPS 46-3)
  • Hash & Checksum (hash-digest) — MD5, SHA-1/256/384/512, SHA3, CRC32, Blake2b, Blake3. hash_file, verify_hash, hash_manifest operations
  • HMAC (hmac-auth) — HMAC-SHA256/384/512/SHA3 for webhook signature verification (Shopify, Stripe, QuickBooks) and B2B integrity
  • S/MIME Encrypt/Sign (smime-crypto) — S/MIME (RFC 8551) encryption and signing with X.509 certs. Integrates with platform certificate manager

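The constant-time comparison at the heart of webhook signature verification can be sketched with the standard library, using a Shopify-style signature as the example (Shopify sends a base64-encoded HMAC-SHA256 of the raw request body; the function name here is ours, not a platform API):

```python
import base64
import hashlib
import hmac

def verify_shopify_webhook(secret: bytes, body: bytes, header_sig: str) -> bool:
    """Return True if header_sig matches HMAC-SHA256(secret, body), base64-encoded."""
    digest = hmac.new(secret, body, hashlib.sha256).digest()
    expected = base64.b64encode(digest).decode()
    # compare_digest avoids leaking the match position via timing
    return hmac.compare_digest(expected, header_sig)

secret, body = b"s3cret", b'{"id": 1}'
sig = base64.b64encode(hmac.new(secret, body, hashlib.sha256).digest()).decode()
print(verify_shopify_webhook(secret, body, sig))         # True
print(verify_shopify_webhook(secret, b"tampered", sig))  # False
```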
Communication & Notification

  • Web API Call (http-request) — HTTP requests with Bearer, Basic, ApiKey, OAuth2, and mTLS auth types; template substitution
  • Email Notification (email-notify) — Send SMTP email notifications on flow events or pipeline results
  • Database Query (db-query) — Execute queries against PostgreSQL, MySQL, MSSQL, Oracle with per-row processing
  • ODBC Database (odbc-connect) — ODBC driver-based connectivity for any database with an ODBC driver (DB2, Informix, Teradata, Snowflake, SAP HANA, etc.). Parameterized queries, bulk insert, connection pooling

EDI & B2B Integration

  • EDI Translator (edi-translate) — Parse/generate X12 EDI with 14 transaction sets (830, 832, 850, 855, 856, 810, 860, 879, 888, 889, 997, 852, 940, 945) and retailer-specific profiles

ERP / CRM / Accounting

  • QuickBooks Connector (quickbooks-connect) — Read/write QuickBooks Online (OAuth2 REST) and Desktop (QBXML) — invoices, POs, items, customers
  • SAP Connector (sap-connect) — SAP ERP and S/4HANA via RFC/BAPI or OData APIs. IDocs inbound/outbound, sales orders, material master
  • Sage Connector (sage-connect) — Sage 50/200/Intacct REST API — invoices, purchase orders, GL entries, inventory, customer/vendor records
  • Salesforce Connector (salesforce-connect) — Salesforce REST/Bulk API with OAuth2 — SOQL queries, sObject CRUD, bulk import/export, Platform Events

E-Commerce Integrations

  • Shopify Connector (shopify-connect) — Shopify Admin REST/GraphQL — orders, products, inventory, fulfillments. Webhook-triggered flows
  • Instacart Connector (instacart-connect) — Instacart Connect API — catalog sync, order ingestion, availability updates, delivery status tracking
  • DoorDash Connector (doordash-connect) — DoorDash Drive API — delivery creation, status tracking, order integration, menu/catalog sync

Spreadsheet & Document Connectors

  • Spreadsheet Connector (spreadsheet-connect) — Read/write Excel (.xlsx/.xls) and Google Sheets with column mapping and header auto-detection

Automation & Scripting

  • Script Runner (script-runner) — Execute sandboxed Python or shell scripts within a flow
  • File Watcher (file-watcher) — Trigger flows based on file system events

AI & LLM Integration

  • Claude AI Assistant (claude-ai) — Anthropic Claude integration for data analysis, document generation, content classification, field extraction, and natural language flow design assistance
  • AI Agent Gateway (ai-agent) — Multi-provider AI module supporting OpenAI, Google Gemini, AWS Bedrock, Azure OpenAI, and self-hosted models (Ollama, vLLM). Text generation, structured output, embeddings, and agentic tool-use

Cloud Data Warehouses

  • BigQuery Connector (bigquery-connect) — Google BigQuery read/write. SQL queries, streaming inserts, batch load from GCS, table/dataset management, schema inspection. OAuth2 or service account auth
  • Snowflake Connector (snowflake-connect) — Snowflake read/write. SQL queries, COPY INTO for bulk load/unload, warehouse/database/schema management, stage operations. Key-pair or OAuth auth
  • Redshift Connector (redshift-connect) — Amazon Redshift read/write. SQL queries, COPY/UNLOAD for S3-based bulk operations, Redshift Serverless support. IAM or username/password auth
  • Azure Synapse Connector (synapse-connect) — Azure Synapse Analytics read/write. SQL queries, COPY INTO for ADLS-based bulk operations, dedicated and serverless pools. Azure AD or SQL auth
  • Databricks Connector (databricks-connect) — Databricks SQL Warehouse and Delta Lake read/write. SQL queries, Unity Catalog integration, Delta table operations. PAT or OAuth auth

Cloud Pub/Sub & Messaging

  • Google Pub/Sub (gcp-pubsub) — Publish and subscribe to Google Cloud Pub/Sub topics. Batch publish, pull/streaming-pull subscriptions, dead-letter handling, ordering keys, message filtering
  • AWS SNS/SQS (aws-messaging) — Publish to SNS topics, send/receive SQS messages. FIFO and standard queues, dead-letter queues, message attributes, batch operations, long polling
  • Azure Service Bus (azure-servicebus) — Send/receive Azure Service Bus messages. Queues and topics/subscriptions, sessions, dead-letter, scheduled messages, peek-lock/receive-and-delete modes
  • Apache Kafka (kafka-connect) — Produce and consume Apache Kafka messages. Topic management, consumer groups, offset management, schema registry integration (Avro/JSON Schema), exactly-once semantics

IoT & Edge Messaging

  • MQTT Client (mqtt-connect) — Publish and subscribe to MQTT brokers (v3.1.1 and v5.0). QoS levels 0/1/2, retained messages, last will, topic wildcards, TLS/mTLS, WebSocket transport

Real-world flows,
ready out of the box.

DataFlow ships with pre-built flow templates for common B2B scenarios — demonstrating how the engine's modules chain together in production. Retail EDI is one powerful example: connecting suppliers to Walmart, Amazon, Target, and Costco over AS2 with full X12 support. The same pipeline engine handles any industry, any protocol, any data shape.

Pre-built Flow Templates

  • Walmart · Inbound PO: AS2 Receive → EDI Parse 850 → QuickBooks Write Order → EDI Gen 997 → AS2 Send
  • Walmart · Invoice: QuickBooks Read Invoice → EDI Gen 810 → AS2 Send → Wait For 997
  • Amazon · Vendor Central PO: HTTP Poll API → EDI Parse 850 → Spreadsheet Write Excel → Email Notify
  • 3PL Ship Confirmation: AS2 Receive (3PL) → EDI Parse 945 → EDI Gen 856 → AS2 Send Retailer

Supported Transaction Sets

  • 830 — Planning Schedule (Both)
  • 832 — Price/Sales Catalog (Both)
  • 850 — Purchase Order (Both)
  • 855 — PO Acknowledgment (Both)
  • 856 — Advance Ship Notice (Both)
  • 810 — Invoice (Both)
  • 860 — PO Change Request (Both)
  • 879 — Price Information (Both)
  • 888 — Item Maintenance (Both)
  • 889 — Promotion Announcement (Both)
  • 997 — Functional Acknowledgment (Both)
  • 852 — Product Activity Data (Both)
  • 940 — Warehouse Ship Order (Both)
  • 945 — Warehouse Shipping Advice (Both)

Trading Partner Features

  • Centralized partner profile store
  • AS2 endpoint & certificate management
  • Atomic control number sequencing
  • Interchange tracking & acknowledgment status
  • Overdue 997 detection & alerting
  • Retailer-specific profiles (Walmart, Amazon, Target, Costco)
  • Document chain queries (PO → Invoice → ASN)

Execution that
never misses a beat.

The DataFlow engine is built for production. Every execution detail — waves, conditions, sub-flows, lineage — is tracked, stored, and visible.

Parallel

Wave-Based Execution

Independent steps are grouped into waves via DAG topological analysis. Each wave executes concurrently up to a configurable max_parallel limit using Tokio JoinSet.
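Wave grouping is essentially layered topological sorting over the DAG. A minimal sketch (the dependency-map encoding and function name are invented for illustration; the real engine runs each wave concurrently rather than returning a plan):

```python
def waves(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group steps into waves: a step joins the first wave after all its deps."""
    remaining = {step: set(d) for step, d in deps.items()}
    done: set[str] = set()
    plan: list[list[str]] = []
    while remaining:
        # Every step whose dependencies are all satisfied runs in this wave.
        wave = sorted(s for s, d in remaining.items() if d <= done)
        if not wave:
            raise ValueError("cycle detected in flow DAG")
        plan.append(wave)
        done.update(wave)
        for s in wave:
            del remaining[s]
    return plan

# A feeds B and C; D needs both B and C:
print(waves({"A": set(), "B": {"A"}, "C": {"A"}, "D": {"B", "C"}}))
# [['A'], ['B', 'C'], ['D']]
```

Wave two contains both B and C, which is exactly the parallelism the engine exploits, bounded by max_parallel.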

Conditions

Conditional Branching

10 condition variants: Expression, Equals, NotEquals, GreaterThan, LessThan, Exists, All, Any, Not, Always. Template variable resolution from previous step outputs.
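A toy interpreter for a few of those variants shows the shape of the evaluation. The dict encoding below is an assumption for illustration, not the engine's actual condition format:

```python
def evaluate(cond: dict, ctx: dict) -> bool:
    """Evaluate a condition tree against step outputs collected in ctx (sketch)."""
    kind = cond["type"]
    if kind == "Always":
        return True
    if kind == "Exists":
        return cond["key"] in ctx
    if kind == "Equals":
        return ctx.get(cond["key"]) == cond["value"]
    if kind == "Not":
        return not evaluate(cond["inner"], ctx)
    if kind == "All":
        return all(evaluate(c, ctx) for c in cond["conditions"])
    if kind == "Any":
        return any(evaluate(c, ctx) for c in cond["conditions"])
    raise ValueError(f"unsupported condition: {kind}")

ctx = {"status": "ok", "rows": 42}
cond = {"type": "All", "conditions": [
    {"type": "Equals", "key": "status", "value": "ok"},
    {"type": "Exists", "key": "rows"},
]}
print(evaluate(cond, ctx))  # True
```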

Sub-flows

Composable Sub-flows

Flows can invoke other flows as sub-flows with full parent chain tracking, depth limits, and cycle detection. Enables modular pipeline design at scale.
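The parent-chain bookkeeping reduces to two checks before each sub-flow invocation; a minimal sketch with invented names:

```python
def check_subflow(parent_chain: list[str], child: str, max_depth: int = 8) -> None:
    """Reject a sub-flow call that would recurse or exceed the depth limit."""
    if child in parent_chain:
        # Invoking an ancestor again would loop forever.
        raise ValueError("cycle: " + " -> ".join(parent_chain + [child]))
    if len(parent_chain) + 1 > max_depth:
        raise ValueError("max sub-flow depth exceeded")

# Fine: 'enrich' is not already on the call chain.
check_subflow(["daily-etl", "load-orders"], "enrich")
# Would raise: 'daily-etl' invoking itself via its own descendant.
# check_subflow(["daily-etl", "load-orders"], "daily-etl")
```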

Lineage

Data Lineage Tracking

Every DataProduced, DataConsumed, DataTransformed, SubFlowLink, ExternalSource, and ExternalSink event is tracked. Upstream and downstream traces via recursive CTE queries.
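A downstream trace via a recursive CTE can be sketched in a few lines. This runnable example uses SQLite and an invented two-column edge table standing in for the lineage events; the platform's actual schema is not shown here:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE lineage (src TEXT, dst TEXT);  -- produced/consumed edges
INSERT INTO lineage VALUES
    ('orders.csv', 'parsed'),
    ('parsed', 'enriched'),
    ('enriched', 'invoice_810');
""")

# Walk every artifact reachable downstream of the starting node.
downstream = con.execute("""
WITH RECURSIVE trace(node) AS (
    SELECT ?
    UNION
    SELECT l.dst FROM lineage l JOIN trace t ON l.src = t.node
)
SELECT node FROM trace
""", ("orders.csv",)).fetchall()

print([n for (n,) in downstream])
# ['orders.csv', 'parsed', 'enriched', 'invoice_810']
```

Swapping the join direction (`l.dst = t.node`, selecting `l.src`) gives the upstream trace.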

Retry

Resilient Retry Logic

Configurable exponential backoff per step. Fail-fast propagation stops downstream steps on error. Cancellation checks between waves for clean shutdown.
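The backoff schedule such a policy produces can be sketched as follows; parameter names are illustrative, and the actual sleeping and fail-fast propagation are omitted:

```python
import random

def backoff_delays(base: float = 0.5, factor: float = 2.0,
                   max_retries: int = 5, cap: float = 30.0) -> list[float]:
    """Delay in seconds before each retry: base * factor**n, capped, with jitter."""
    delays = []
    for attempt in range(max_retries):
        delay = min(cap, base * factor ** attempt)
        # Jitter spreads out retries from steps that failed simultaneously.
        delays.append(delay * random.uniform(0.8, 1.0))
    return delays

print([round(d, 2) for d in backoff_delays()])
# e.g. [0.46, 0.93, 1.84, 3.71, 7.52]; grows 0.5 -> 1 -> 2 -> 4 -> 8 before jitter
```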

Schedule

Cron Scheduling

Cron-based flow triggering with timezone support. Preview next run times via API. Full schedule lifecycle management through the UI or API.

Deploy your way.
Marketplace or self-hosted.

DataFlow is available on AWS and Google Cloud Marketplace with one-click deployment and integrated billing — or self-host on your own infrastructure with full control. Pricing is based on the number of active data flows: Core ($2,000/yr · 5 flows), Standard ($3,750/yr · 15 flows), Professional ($9,500/yr · 40 flows), Enterprise (from $15,000/yr · unlimited). No per-row or per-execution charges. See full licensing details →

AWS Marketplace

Deploy via a pre-built AMI, EKS Helm chart, or CloudFormation quick-start template. Integrated billing through your AWS account with SaaS subscriptions and annual commit options.

Google Cloud Marketplace

One-click GKE Autopilot deployment with Cloud SQL, load balancing, and managed certificates. Billing through your Google Cloud account via the Procurement API.

Kubernetes (Helm)

Production Helm chart with HPA, PDB, NetworkPolicy, ServiceMonitor, and Ingress with TLS. Deploy to any K8s cluster.

AWS / GCP / Azure

Terraform modules for EKS, GKE, and AKS — each paired with a managed PostgreSQL instance. One command to provision a full production environment.

Linux (Bare Metal)

Interactive and unattended systemd installer. Minimal footprint, no external dependencies.

Windows (Bare Metal)

NSSM-based Windows service installer. Runs alongside existing Windows infrastructure.

Offline. Tiered.
HA-ready.

DataFlow uses Ed25519-signed offline licenses — no call-home, no license server, and full support for air-gapped deployments. Four license tiers gate module access at design time and runtime. Native HA deployment modes support Production/DR pairs and Active-Active clusters.

Offline · Ed25519

No Call-Home Required

All license verification is done locally using Ed25519 signature verification against an embedded public key. Works in air-gapped networks and classified environments without any external connectivity.

4 Tiers

Per-Module Tier Gating

Core → Standard → Professional → Enterprise. Each module declares its minimum tier via min_license_level(). The engine enforces tier checks at flow save, execution time, and in the module listing UI.
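The tier ordering can be modeled as a comparable enum; this is a sketch of the check a min_license_level() declaration implies, with every name other than the four tier names being illustrative:

```python
from enum import IntEnum

class Tier(IntEnum):
    # Ordering matters: Core < Standard < Professional < Enterprise
    CORE = 0
    STANDARD = 1
    PROFESSIONAL = 2
    ENTERPRISE = 3

def module_allowed(module_min_tier: Tier, license_tier: Tier) -> bool:
    """A module is usable when the license tier meets its declared minimum."""
    return license_tier >= module_min_tier

print(module_allowed(Tier.PROFESSIONAL, Tier.STANDARD))    # False
print(module_allowed(Tier.PROFESSIONAL, Tier.ENTERPRISE))  # True
```

The same comparison can run at flow save, at execution time, and when filtering the module listing, which keeps the three enforcement points consistent.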

HA Modes

Deployment-Mode Licensing

License keys encode HA role: Standalone, Production (active primary), Disaster Recovery (standby, reduced pricing), or Active-Active (full price per node). The engine enforces execution restrictions for DR nodes until promoted.

Full Licensing & HA Details

Be among the first to
run DataFlow.

Request Early Access · Integration Services
MODULES 30+ · THROUGHPUT 2.4M events/s · LATENCY 0.8ms · UPTIME 99.99% · ACTIVE FLOWS 847 · DATA PROCESSED 18.3TB · EDI TRANSACTIONS 12.1K