Confused between PyPI/hopsworks and PyPI/hsfs

Hi folks, I followed a link to the “quickstart colab” and found this code snippet:

import hopsworks

project = hopsworks.login()
fs = project.get_feature_store()

trans_fg = fs.get_or_create_feature_group(
    name="transactions",
    version=1,
    description="Transaction data",
    primary_key=["cc_num"],
    event_time="datetime",
    online_enabled=True
)

and then in the FeatureTools tutorial I saw this other code snippet:

import hsfs

# create connection to HSFS
connection = hsfs.connection()
# load the default feature store
fs = connection.get_feature_store()

# initialize the feature group
fg = fs.create_feature_group("Demo Retail Data",
    version=1,
    description="Features created for demo retail data using FeatureTools",
    primary_key=['SK_ID_CURR', 'SK_ID_BUREAU', 'SK_ID_PREV', 'bureaubalance_index', 'cash_index', 'installments_index', 'credit_index'],
    online_enabled=True)

I see hopsworks depends on hsfs, so I assume the underlying objects belong to the latter, but a quick inspection of the code didn’t clarify this. In practical terms, which API should we use?

Hi @astrojuanlu,

As you mentioned, the hopsworks library includes both hsfs and hsml. Both code snippets should return the same feature store class. However, we recommend using hopsworks if you are on the serverless platform, because it makes login easier.
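
If you want to check the dependency relationship in your own environment, here is a minimal sketch that uses only the standard library. The helper names `declared_dependencies` and `depends_on` are made up for this example and are not part of hopsworks or hsfs. It only reads the metadata of whatever version you have installed, so the result may differ between releases.

```python
import re
from importlib import metadata


def declared_dependencies(package: str) -> list[str]:
    """Return the requirement strings declared by a locally installed package."""
    try:
        # metadata.requires() returns None when no requirements are declared.
        return metadata.requires(package) or []
    except metadata.PackageNotFoundError:
        # The package is not installed in this environment.
        return []


def depends_on(package: str, dependency: str) -> bool:
    """True if `package` lists `dependency` among its declared requirements."""
    for requirement in declared_dependencies(package):
        # Take the leading distribution name, dropping extras, versions and markers.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", requirement)
        if match and match.group(0).lower() == dependency.lower():
            return True
    return False


# Prints False if hopsworks is not installed or does not declare hsfs.
print(depends_on("hopsworks", "hsfs"))
```

This only tells you what the package metadata declares. To confirm that both snippets hand back the same class, you would still need a live connection and could compare type(fs) from each one.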

Kenneth
