Error saving pandas dataframe in feature group

Hello!

I’m a new Hopsworks user. I’m trying to ingest some data into a feature group. I’m working from a conda environment in VSCode. I’m able to create a connection using the project name and API key:

import hsfs

conn = hsfs.connection(
    project=HOPSWORKS_PROJECT_NAME,
    api_key_value=HOPSWORKS_API_KEY
)

and get the feature store:

fs = conn.get_feature_store()

Creating a feature group is working:

fg = fs.create_feature_group(
    name='some_name',
    version=1,
    description='some_description',
    primary_key=['some_id'],
    online_enabled=True
)

But when I try to save the dataframe:

fg.save(df)

an error is raised (everything seems OK at first, but the error appears after several seconds).

The feature group and the features are shown in the app, but there’s no data.

I’d appreciate any help to understand what’s going on.

Thank you!

Hi @jacasta2,

Looks like the ingestion job failed. You can have a look at the job’s logs in the UI > Ingestion Job > nuggets_2016_1_offline_fg_backfill by clicking the Logs button on the failed execution.

In any case, I can take a look at the logs for you, and I’ll let you know what the issue is asap.


Fabio

Hello, @Fabio,

Thanks for the reply. I was writing a reply when I saw you were replying.

A quick update… The data DO seem to be there: they show up under the Data preview tab when I open the feature group. I guess I was checking in the wrong place (clicking the Inspect data button).

The error goes away when I add the argument:

write_options={"start_offline_backfill": False}

when calling fg.save() (I saw this in a tutorial repo that uses Hopsworks).
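
For reference, here is a minimal sketch of the full call with that argument. The helper name save_without_backfill and the FakeFeatureGroup class are hypothetical, added only so the snippet runs without a Hopsworks connection; with a real feature group you would pass fg and df directly. As far as I understand, this option only skips starting the offline backfill job, so it avoids the failing job rather than fixing it.

```python
from typing import Any


def save_without_backfill(fg: Any, df: Any) -> None:
    """Save df to the feature group without starting the offline backfill job.

    Hypothetical helper: it only wraps the call described in this post.
    """
    fg.save(df, write_options={"start_offline_backfill": False})


class FakeFeatureGroup:
    """Hypothetical stand-in for a real feature group; it only records calls."""

    def __init__(self) -> None:
        self.calls = []

    def save(self, features: Any, write_options: Any = None) -> None:
        # A real feature group would upload the data here.
        self.calls.append((features, write_options))


# Placeholder rows instead of a real pandas dataframe.
fake_fg = FakeFeatureGroup()
save_without_backfill(fake_fg, [{"some_id": 1}])
print(fake_fg.calls)
```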

At any rate, I don’t know what causes the error or whether it makes sense for such an error to be raised.

Here are the logs: logs

I’d appreciate any help to understand this.

Thanks!

Hi @jacasta2, it seems the issue is related to serializing/deserializing your name. Sorry about that. I’m not yet sure why, as the job is running with UTF-8 encoding.

I managed to reproduce the issue in my development environment and I’m investigating further. I’ll keep you posted.
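
As a general illustration only (not a diagnosis of this job, whose root cause is not stated in this thread), here is a small sketch of how a name with a non-ASCII character can get mangled when it is encoded with one codec and decoded with another. The name used is a made-up example.

```python
# Hypothetical name containing a non-ASCII character (e with acute accent).
name = "Jos\u00e9"

# Encoding and decoding with the same codec round-trips cleanly.
encoded = name.encode("utf-8")
round_trip = encoded.decode("utf-8")

# Decoding the same bytes with a different codec does not raise,
# but it silently produces a different string.
mismatched = encoded.decode("latin-1")

print(round_trip == name)
print(mismatched == name)
print(len(name), len(mismatched))
```

The accented character takes two bytes in UTF-8, so decoding those bytes as Latin-1 yields two characters instead of one.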

Hi @jacasta2
Thank you for your patience. This issue should be solved now. Can you try to run your job again?

Regards,
Fabio

Hello, @Fabio,
I’ll take a look early next week, as I’m leaving for the weekend. I’ll let you know when I run the job.
Thanks a lot!

Hello, @Fabio!

I ran the job and no error was raised!

I don’t know if I’m going to understand it, but what was the issue?

Thanks a lot!

JA