Open Source ML Platform

Feature Factory

Machine learning for the masses. Feature engineering, model training, scoring, deployment, and governance, all running inside your existing data warehouse. No data movement, no external ML platform, no specialist ML team required.

Coming soon. Join the waitlist to be notified when available.

ML First. In-Database. On the Data You Already Have.

Most enterprise ML deployments start with a problem that has nothing to do with the model: getting the data out of the warehouse and into an external ML platform. That movement is slow and expensive, and it creates a separate data governance problem.

Feature Factory takes a different approach. It runs feature engineering, model training, scoring, and deployment directly inside your existing database, using the processing power of your existing data platform. Your data never moves. Your governance controls remain in place. Your ML pipeline runs at warehouse scale without a separate platform.

The philosophy is pragmatic: classical machine learning on well-governed data, applied to the core business decisions that drive revenue, manage risk, and improve operations. Feature Factory is not an AI hype vehicle; it is an engineering tool for getting reliable ML into production on the data you already have.

No Data Movement
Features and models run inside your database; data stays in the warehouse
Multi-Platform
PostgreSQL, Greenplum/Cloudberry, IBM® Netezza®, Databricks; use the processing power of your existing data platform
Governance Included
Full audit trail, data lineage, and GDPR/CCPA/EU AI Act compliance automation built in
Open Core + Commercial
Open-source core with Community, Professional, and Enterprise tiers

The Platform at a Glance

What Feature Factory Provides

Feature Engineering Engine

Define and compute ML features using in-database SQL pushdown. Manage feature sets with full versioning and a feature store. Features are computed inside your data warehouse; no extraction required.
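
As a rough illustration of what "SQL pushdown" means, the sketch below computes two per-customer features with a single aggregate query that runs inside the database engine. It is not Feature Factory's API: an in-memory SQLite database stands in for the warehouse, and the table and column names (`orders`, `customer_features`, `order_count`, `avg_order_value`) are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for the warehouse (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 20.0), (1, 30.0), (2, 5.0)],
)

# The aggregation executes inside the database engine. Only the resulting
# feature table is materialised; no raw rows are pulled into the client.
conn.execute(
    """
    CREATE TABLE customer_features AS
    SELECT customer_id,
           COUNT(*)    AS order_count,
           AVG(amount) AS avg_order_value
    FROM orders
    GROUP BY customer_id
    """
)

# Read back the (small) feature table for inspection.
features = conn.execute(
    "SELECT customer_id, order_count, avg_order_value "
    "FROM customer_features ORDER BY customer_id"
).fetchall()
```

The client only issues SQL and reads back the finished features, which is the property that lets the data stay where it is.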

Model Management & AutoML

Train, version, and deploy models, and run predictions, using in-database analytics. AutoML recommends an algorithm from Decision Trees, K-Means, and GLM. Experiment tracking and full model history are maintained automatically.

Data Quality Engine

Automated table and column profiling with quality scoring across completeness, distribution, and pattern validation. Quality rule definitions execute on schedule and report against configurable thresholds.
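
A completeness rule checked against a threshold might look roughly like the sketch below. The function names and the 0.95 default are illustrative assumptions, not part of the product, and in-database profiling would express the same ratio in SQL.

```python
from typing import Optional, Sequence


def completeness(values: Sequence[Optional[object]]) -> float:
    """Fraction of non-null values in a column (1.0 for an empty column)."""
    if not values:
        return 1.0
    return sum(v is not None for v in values) / len(values)


def passes_rule(values: Sequence[Optional[object]], threshold: float = 0.95) -> bool:
    """True when the column's completeness meets the configured threshold."""
    return completeness(values) >= threshold
```

For example, a four-row column with one null scores 0.75, which fails a 0.95 threshold and passes a 0.70 one.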

Governance Engine

Business domain hierarchy, data stewardship, business glossary, and lineage tracking. Impact analysis shows downstream effects of changes. Full audit trail for every operation on governed data assets.

Drift Detection Engine

Monitors feature and data drift over time using PSI (Population Stability Index) metrics. Configurable alert thresholds notify you when data distributions shift enough to affect model reliability.
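
For readers unfamiliar with PSI, the sketch below shows the standard calculation: bin a baseline sample and a current sample, then sum (actual − expected) × ln(actual / expected) over the bins. The binning strategy (equal-width bins taken from the baseline range), the small floor applied to empty bins, and the function name are assumptions made for illustration. This is not the engine's implementation.

```python
import math
from typing import List, Sequence


def psi(expected: Sequence[float], actual: Sequence[float], bins: int = 10) -> float:
    """Population Stability Index between a baseline and a current sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # guard against a constant baseline

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            # Clamp values outside the baseline range into the first or last bin.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # Floor at a small epsilon so empty bins do not produce log(0).
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

A commonly quoted rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift. Those cut-offs are conventions rather than product defaults, since the alert thresholds here are configurable.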

Compliance Automation Engine

GDPR, CCPA, and EU AI Act compliance built in. Handles DSAR management (with email mailbox polling for incoming requests), consent tracking, automated erasure workflows, and ROPA and DPIA documentation.

Geospatial Analytics Engine

Spatial feature engineering with distance calculations, region lookups, spatial aggregations, and UK postcode geocoding with LSOA/MSOA lookups for demographic analysis.
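
A distance feature between two coordinate pairs is often computed with the haversine (great-circle) formula. The sketch below is a generic version of that formula, not the engine's own code; the function name and the spherical-Earth radius are assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
```

Used as a feature, this might be the distance from a customer's geocoded postcode to the nearest branch or store.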

Demographic Engine

Enriches data with UK demographic profiles at Lower Super Output Area (LSOA) level: population, age, income, and OAC classification. Supports multi-country lookups including world cities and country centroids. Demographic feature templates accelerate common enrichment patterns.

Built for Regulated Industries

Feature Factory includes built-in compliance automation for GDPR, CCPA, and the EU AI Act — the regulatory frameworks most relevant to organisations using ML on customer and behavioural data.

These are not add-ons. Compliance functionality is built into the platform from the ground up, covering the operational workflows (DSAR, erasure, consent) and the documentation requirements (ROPA, DPIA, AI system documentation) that regulated industries need.

Compliance automation covers:

  • DSAR (Data Subject Access Request) management with email mailbox polling
  • Granular consent tracking by subject and purpose
  • Lawful basis registry with legislation mapping
  • Automated Right to Be Forgotten erasure workflows
  • ROPA (Records of Processing Activities) management
  • DPIA (Data Protection Impact Assessment) documentation
  • EU AI Act system documentation
  • Full audit trail with tamper-evident logging
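
One common way to make an audit log tamper-evident is a hash chain, where each entry stores a hash of its own content combined with the previous entry's hash. The sketch below illustrates that general technique only. How Feature Factory implements its logging is not described here, and every name in the code is hypothetical.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, event: Dict) -> str:
    # Serialise with sorted keys so the same event always hashes the same way.
    payload = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log: List[Dict], event: Dict) -> None:
    """Append an event, chaining it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append(
        {"event": event, "prev_hash": prev_hash, "hash": _digest(prev_hash, event)}
    )


def verify(log: List[Dict]) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Changing any earlier event changes its hash, so `verify` fails unless every later entry is rewritten as well.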

Community, Professional, and Enterprise

Feature Factory is licensed per database instance. The Community tier is free, with a delayed release schedule. Professional and Enterprise tiers provide current releases and additional support.

Community
Free
6-month release delay

Evaluation & Development

For developers, non-profit organisations, academic use, and teams evaluating the platform before a commercial deployment.

  • All engines included
  • 6-month delayed releases
  • Community support

Professional
£1,500
per month, per database instance

  • Current releases
  • 9×5 support

Enterprise
£5,000
per month, per database instance

Mission-Critical

For organisations requiring immediate access to current releases, around-the-clock support, and roadmap influence.

  • All engines included
  • Immediate GA releases
  • Critical bug fixes: immediate
  • 24×7 support via dedicated help desk + email
  • Priority SLAs (P1: 2hr, P2: 4hr)
  • Direct input into roadmap priorities

Bring ML to Your Data

We are porting Feature Factory to PostgreSQL, Greenplum/Cloudberry, Databricks, and beyond, and open-sourcing everything. Join the waitlist to be the first to know when it is available, or get in touch to discuss whether Feature Factory is a fit for your environment.

Join the Waitlist

It takes under a minute: only your name and email are required, and everything else is optional. You will get an immediate confirmation, and we will notify you the moment Feature Factory is available for your platform.

Your data is processed by Smart Associates Limited in accordance with our Privacy Policy and will never be shared with third parties.

Want to discuss licensing or suitability? Contact us directly

Common Questions

The Platform
  • Feature Factory is an open-source, in-database machine learning platform from Smart Associates. It includes eight engines covering feature engineering, model training and AutoML, data quality, governance, drift detection, compliance automation (GDPR/CCPA/EU AI Act), geospatial analytics, and demographic enrichment, all running inside the existing data warehouse without data movement.

  • Feature Factory uses in-database SQL pushdown to execute feature engineering and model training directly within the data warehouse engine. On IBM Netezza, it leverages the INZA in-database analytics framework. Data stays in the warehouse, so existing governance controls remain in place and no external ML infrastructure is required.

  • Feature Factory does not require a specialist ML team. It is designed to run inside the existing data warehouse without a separate ML platform or data pipeline. Data engineers familiar with SQL can work with the platform using guided workflows, though data science expertise benefits model design.

Pricing & Availability
  • Feature Factory currently ships on IBM Netezza with INZA. Smart Associates is porting the platform to PostgreSQL, Greenplum/Cloudberry, and Databricks, with open-source release planned. Join the waitlist at smart-associates.biz/products/feature-factory/ to be notified when new platforms become available.

  • Feature Factory is licensed per database instance in three tiers. Community is free with a six-month release delay, suitable for evaluation and non-commercial use. Professional is £1,500 per month per database instance with 9×5 support. Enterprise is £5,000 per month per database instance with 24×7 support and priority SLAs (P1: 2-hour response).

  • The Feature Factory Community tier is a free licence that includes all eight engines with a six-month release delay relative to the current release. It is designed for developers, academic users, non-profit organisations, and teams evaluating the platform before a production deployment.

  • Feature Factory is currently in development for general availability. The platform has been running on IBM Netezza with INZA. The open-source release and expanded platform support are pending. Join the waitlist at smart-associates.biz/products/feature-factory/ to be the first to know when it is available.

Technical Details
  • Feature Factory’s AutoML engine supports Decision Trees, K-Means clustering, and Generalised Linear Models (GLM) within its in-database execution model. AutoML benchmarks multiple algorithms and recommends the best performing model for the dataset and use case, maintaining full experiment history automatically.

  • Feature Factory includes a dedicated Compliance Automation Engine covering GDPR, CCPA, and EU AI Act requirements. It manages DSAR (Data Subject Access Requests) via email mailbox polling, consent tracking, lawful basis registry, Right to Be Forgotten erasure workflows, ROPA management, DPIA documentation, and EU AI Act system documentation.

  • The Drift Detection Engine in Feature Factory monitors feature and data distributions over time using Population Stability Index (PSI) metrics. When a data distribution shifts beyond a configurable threshold, the engine raises an alert, indicating that model reliability may be affected and retraining should be considered.

  • The Geospatial Analytics Engine in Feature Factory supports spatial feature engineering including distance calculations, region lookups, spatial aggregations, and UK postcode geocoding with LSOA and MSOA demographic lookups. This enables geographic segmentation and spatial ML features without external GIS tools.

  • The Feature Factory Demographic Engine enriches data with UK demographic profiles from Lower Super Output Area (LSOA) classifications, covering population, age distribution, income, and OAC categories. It also supports multi-country lookups for world cities and country centroids, with demographic feature templates to accelerate common enrichment patterns.