I’m researching feature stores and feature engineering pipelines. The former is well covered in most cases (both Hopsworks and Feast have good enough solutions), but I haven’t seen an explicit implementation of the latter, which is the important use case for my team. We are a small team and don’t want to maintain both an ETL pipeline in one system and a list of features in a feature store. Ideally, the two would be coupled.
In this video, https://youtu.be/0wfxWFaDG9Q (Data Engineering Melbourne Meetup, Jim Dowling, 30th April 2020), @Jim_Dowling says “this [feature engineering] can be done on Databricks or Sagemaker”. I’m not familiar with Databricks, but Sagemaker seems more focused on batch processing of features from S3. As far as I can tell, that is only useful when training a model, not while serving a model in production.
Which feature engineering system is commonly used for live production systems, and how well would it integrate with Hopsworks?