I am trying to redeploy Hopsworks on a machine with Ubuntu 18.04. It works fine until it reaches hopsworks::default, where it hits an error:
[2021-07-06T13:35:28+00:00] ERROR: Running exception handlers Running handlers complete
[2021-07-06T13:35:28+00:00] ERROR: Exception handlers complete Chef Client failed. 6 resources updated in 14 seconds
[2021-07-06T13:35:28+00:00] FATAL: Stacktrace dumped to /tmp/chef-solo/chef-stacktrace.out
[2021-07-06T13:35:28+00:00] FATAL: Please provide the contents of the stacktrace.out file if you file a bug report
[2021-07-06T13:35:28+00:00] FATAL: RuntimeError: ruby_block[check_db_empty] (hopsworks::default line 197) had an error: RuntimeError: You are trying to initialize the database, but the database is not empty. Either there is a failed migration, or you forgot to set the current_version attribute
We got some hints about deleting the databases in MySQL (the Hopsworks database), but that does not seem to make any difference.
Any clues on how to get further?
(without a complete wipeout of the machine).
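One way to confirm whether the drop actually took effect is to count the tables left in the schema. This is only a sketch: the schema name `hopsworks` is an assumption based on "the Hopsworks database" above, the function names are made up for this example, and it assumes a `mysql` client on the PATH that can already connect (for example through credentials in `~/.my.cnf`). The exact condition tested by `check_db_empty` in the recipe may differ.

```shell
# Build the query that counts the tables remaining in a schema.
# table_count_sql and count_tables are hypothetical helper names.
table_count_sql() {
    local schema="$1"
    printf "SELECT COUNT(*) FROM information_schema.tables WHERE table_schema = '%s';" "$schema"
}

# Run the query with the stock mysql client.
# -N suppresses the column header, -B gives plain tab-separated output.
count_tables() {
    mysql -N -B -e "$(table_count_sql "$1")"
}
```

Calling `count_tables hopsworks` should print 0 if no tables are left in that schema. A non-zero count would suggest the drop did not remove everything.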
Can you please give a few more details on the installation?
Did you previously install Hopsworks on this machine with the hopsworks installer: Hopsworks Installer — Documentation 2.2 documentation ?
Did you try cleaning the machine of hopsworks artifacts before trying to re-install? Did you for example run the purge step from the documentation above? ./hopsworks-installer.sh -i purge -ni
If you try to run on the machine the command that failed, what output do you get? /srv/hops/glassfish/versions/current/bin/asadmin --host localhost --port 4848 --user adminuser --passwordfile /srv/hops/domains/domain1_admin_passwd --interactive=false --echo=true --terse=false set resources.managed-executor-service.concurrent/hopsExecutorService.thread-priority=10
Yes, I did the purge before reinstalling. I also tried that command, and it complained about a missing executor service (the hopsExecutorService). I am not sure why it was missing, but that seems to be the case each time I deploy on a machine that already had a completed installation (even after a purge).
(but now I did a reinstall and have it up-n-running but I will try later to get the same error printed).
I finally reproduced the failure. Here is a copy of the log file (Hopsworks_default.log). I did not capture the very beginning, but I think I got the first errors and the last ones too: hopsworks__default.log.zip (6.7 KB)
To completely purge the installation, shut down all services using the script /srv/hops/kagent/kagent/bin/shutdown-all-local-services.sh. Then delete the installation directory; if you didn’t change it, it defaults to /srv/hops, so run sudo rm -rf /srv/hops. Finally, delete the /etc/docker directory. If you have installed Kubernetes, also delete /etc/kubernetes and the files in /home/kubernetes. Then you can retry the installation.
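The purge steps above can be sketched as a small script. This is an illustration, not an official Hopsworks tool: the names `HOPS_DIR`, `DRY_RUN`, `WITH_KUBERNETES`, `run` and `purge_hopsworks` are invented for this example, and the paths are the defaults mentioned above. It defaults to a dry run that only prints the commands, since the real thing is destructive.

```shell
#!/usr/bin/env bash
# Hypothetical helper names; paths are the defaults from the post above.
HOPS_DIR="${HOPS_DIR:-/srv/hops}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

purge_hopsworks() {
    # 1. Stop all services first, while the shutdown script still exists.
    run sudo "$HOPS_DIR/kagent/kagent/bin/shutdown-all-local-services.sh"
    # 2. Remove the installation directory.
    run sudo rm -rf "$HOPS_DIR"
    # 3. Remove the Docker configuration directory.
    run sudo rm -rf /etc/docker
    # 4. Only when Kubernetes was installed as well.
    if [ "${WITH_KUBERNETES:-0}" = "1" ]; then
        run sudo rm -rf /etc/kubernetes
        # Delete the contents of /home/kubernetes but keep the directory.
        run sudo find /home/kubernetes -mindepth 1 -delete
    fi
}
```

Running `purge_hopsworks` prints what would be executed. `DRY_RUN=0 purge_hopsworks` performs the deletion, so check the printed commands before doing that.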
No, it seems I get the same issue:
“INFO [2021-09-02 16:13:21,457] se.kth.karamel.backend.machines.SshMachine: 10.10.124.23: Running task: hopsworks::default
INFO [2021-09-02 16:21:02,353] se.kth.karamel.backend.machines.SshMachine: 10.10.124.23: Running task: hopsworks::default
ERROR [2021-09-02 16:21:29,386] se.kth.karamel.backend.dag.DagNode: Failed ‘hopsworks::default on 10.10.124.23’ because '10.10.124.23: Command did not complete: mkdir -p /home/ubuntu/.karamel/install ; cd /home/ubuntu/.karamel/install; e”
Hopsworks::default still fails - probably at the same place.
Is there anything else installed by the scripts that does not end up under /srv/…? Some Java config or something else?
If you deleted /srv/hops after you stopped all services, then there isn’t any other place. What’s in the log /home/ubuntu/.karamel/install/hopsworks__default.log?
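Since the first error in that log is usually the most informative one, a quick way to pull it out is to list the first few ERROR and FATAL lines with their line numbers. A minimal sketch, with `first_errors` being a name made up for this example:

```shell
# Print the first matching ERROR/FATAL lines of a log, prefixed with line numbers.
# Usage: first_errors <log-file> [max-lines]   (max-lines defaults to 20)
first_errors() {
    local log_file="$1" max="${2:-20}"
    grep -n -E 'ERROR|FATAL' "$log_file" | head -n "$max"
}
```

For example: `first_errors /home/ubuntu/.karamel/install/hopsworks__default.log 10`.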