Error installing on Ubuntu 18.04

Hey guys,
I was wondering if you can help me with this error. I'm trying to install Hopsworks on Ubuntu Server 18.04:
ERROR [2020-05-26 20:30:03,321] se.kth.karamel.backend.dag.DagNode: Failed 'make solo.rb on 10.216.0.155' because '10.216.0.155: Command did not complete: mkdir -p /home/troy/.karamel/install ; cd /home/troy/.karamel/install; echo $$ > pid; echo '#!/bin/bash
set -eo pipefail
sudo touch solo.rb
sudo chmod 777 solo.rb
cat > solo.rb <<-'END_OF_FILE'
file_cache_path "/tmp/chef-solo"
cookbook_path ["/home/troy/.karamel/cookbooks/hopsworks-chef_vendor"]
END_OF_FILE' > make_solo_rb.sh ; chmod +x make_solo_rb.sh ; ./make_solo_rb.sh', DAG is stuck here :(
INFO [2020-05-26 20:30:04,656] se.kth.karamel.backend.machines.MachinesMonitor: Sending pause signal to all machines
^C

Thanks for any help :)

Hi!

Karamel needs to be able to SSH into all machines, even localhost. Can you try running ssh localhost?
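A minimal sketch of that check, assuming key-based login is what you want to verify. The function name can_ssh is hypothetical and not part of Karamel; BatchMode and ConnectTimeout are standard OpenSSH client options.

```shell
#!/bin/bash
# Hypothetical helper: returns 0 if a non-interactive SSH login to the host works.
can_ssh() {
  local host="${1:-localhost}"
  # BatchMode=yes disables password prompts, so the check fails instead of hanging.
  # ConnectTimeout=5 gives up after five seconds if the host does not answer.
  ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true
}
# Example (not run here): can_ssh localhost && echo "ssh OK" || echo "ssh FAILED"
```

If this fails while an interactive ssh localhost works, the login probably depends on a typed password or an unconfirmed host key.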

Hi,

Yup, ssh localhost works alright

Can you please check your Chef version with chef --version?
The Chef Development Kit should be 3.7 and chef-client 14.

If not you can download it from here

Hey,
So I've got something like this:

troy@vm00394:~$ chef --version
Chef Development Kit Version: 3.7.23
chef-client version: 14.10.9
delivery version: master (64f556d5ebfd7bac2c0b3cc2c53669688b3ea4b5)
berks version: 7.0.7
kitchen version: 1.24.0
inspec version: 3.4.1

Your Chef version is correct. Are you using the hopsworks-installer.sh script or running Karamel directly?
If the user who performs the installation needs a password to get sudo privileges, then you need to specify it. If you're using the installer script, use the --password argument.

Also, you may try running /home/troy/.karamel/install/make_solo_rb.sh manually to get more info.
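A minimal sketch of running the generated script by hand so that the failing command and its exit status are visible. The helper name run_traced is hypothetical; bash -x simply prints each command before executing it.

```shell
#!/bin/bash
# Hypothetical helper: run a script with command tracing and report its exit status.
run_traced() {
  local script="$1"
  local status=0
  # -x echoes every command as it runs, which shows where the script stops.
  bash -x "$script" || status=$?
  echo "exit status: $status"
  return "$status"
}
# Example (not run here): run_traced /home/troy/.karamel/install/make_solo_rb.sh
```

A non-zero exit status here would point at the script itself; a zero status suggests the problem is in how Karamel launches it over SSH.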

Yeah, I'm using hopsworks-installer.sh. As described in the docs, I created a non-root user with sudo access and am running the script as that user.
So, I've just tried the --password argument and the result is the same :(

INFO  [2020-05-28 15:59:34,640] se.kth.karamel.backend.machines.SshMachine: 10.216.0.155: Running task: make solo.rb
ERROR [2020-05-28 15:59:40,902] se.kth.karamel.backend.dag.DagNode: Failed 'make solo.rb on 10.216.0.155' because '10.216.0.155: Command did not complete: mkdir -p /home/troy/.karamel/install ; cd /home/troy/.karamel/install; echo $$ > pid; echo '#!/bin/bash
set -eo pipefail
echo "%password_hidden%" | sudo -S  touch solo.rb
echo "%password_hidden%" | sudo -S  chmod 777 solo.rb
cat > solo.rb <<-'END_OF_FILE'
file_cache_path "/tmp/chef-solo"
cookbook_path ["/home/troy/.karamel/cookbooks/hopsworks-chef_vendor"]
END_OF_FILE' > make_solo_rb.sh ; chmod +x make_solo_rb.sh ; ./make_solo_rb.sh', DAG is stuck here :(
INFO  [2020-05-28 15:59:43,803] se.kth.karamel.backend.machines.MachinesMonitor: Sending pause signal to all machines
^C
troy@vm00394:~$ cd /home/troy/.karamel/install/
troy@vm00394:~/.karamel/install$ ls -l
total 86996
-rwxrwxr-x 1 troy troy     1214 May 28 18:59 aptget.sh
-rw-rw-r-- 1 troy troy 89050222 Feb 17 18:07 chefdk_3.7.23-1_amd64.deb
-rwxrwxr-x 1 troy troy     1019 May 28 18:59 install-chefdk.sh
-rwxrwxr-x 1 troy troy      263 May 28 18:59 make_solo_rb.sh
-rw-rw-r-- 1 troy troy        8 May 28 18:59 ostype
-rwxrwxr-x 1 troy troy      605 May 28 18:59 ostype.sh
-rw-rw-r-- 1 troy troy        5 May 28 18:59 pid
-rwxrwxrwx 1 root root      103 May 28 18:52 solo.rb
-rw-rw-r-- 1 troy troy       47 May 28 18:59 succeed_list
troy@vm00394:~/.karamel/install$ ./make_solo_rb.sh
troy@vm00394:~/.karamel/install$
troy@vm00394:~/.karamel/install$
troy@vm00394:~/.karamel/install$ groups troy
troy : troy sudo
troy@vm00394:~/.karamel/install$
troy@vm00394:~/.karamel/install$

Running ./make_solo_rb.sh on its own seems to work OK. At least it doesn't show any errors.

Hi,
could you please check whether there is a log at /home/troy/.karamel/Hops/logs/10.216.0.155/make_solo_rb.log?

Do you have disk space in the home directory? (you can check with df -h)
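A minimal sketch of both checks in script form. The helper name avail_kb is hypothetical; it relies on the POSIX output layout of df -P, where the fourth column of the data line is the available space.

```shell
#!/bin/bash
# Hypothetical helper: print the free space, in KiB, of the filesystem holding a directory.
avail_kb() {
  local dir="${1:-$HOME}"
  # -P forces the one-line-per-filesystem POSIX layout; -k reports 1024-byte blocks.
  # Column 4 of the second output line is the "Available" figure.
  df -Pk "$dir" | awk 'NR == 2 { print $4 }'
}
# Examples (not run here):
#   echo "free in home: $(avail_kb "$HOME") KiB"
#   ls -l /home/troy/.karamel/Hops/logs/10.216.0.155/make_solo_rb.log
```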

One more thing, we just fixed the hopsworks-installer.sh to install the latest stable version (Hopsworks 1.3) - Could you please re-download the installer from here: https://raw.githubusercontent.com/logicalclocks/karamel-chef/1.3/hopsworks-installer.sh


Fabio

Hi Fabio,
Before using the new installer I checked, and there was no /home/troy/.karamel/Hops/logs/10.216.0.155/make_solo_rb.log. Even the /home/troy/.karamel/Hops/logs/10.216.0.155/ folder didn't exist.

Then I used the new installer.

1) It shows this error:
manycoloredheaven@gmail.com
Registering hopsworks instance....
curl: (7) Failed to connect to snurran.sics.se port 8443: Connection timed out

I could carry on with the installation after it.
2) When the installation started and I checked the logs with tail -f installation.log,
I see the following:

***********************************************************************************************************

Installation has started, but may take 1 hour or more..........

The Karamel installer UI will soon start at:  http://10.216.0.155:9090/index.html
Note: port 9090 must be open for external traffic and Karamel will shutdown when installation finishes.

=====================================================================

You can view the installation logs with this command:

tail -f installation.log

***********************************************************************************************************
troy@vm00394:~$ tail -f installation.log
usage: karamel
 -headless                Launch Karamel from a headless server (no
                          terminal on the server).
 -help                    Print help message.
 -launch <yamlFile>       Karamel cluster definition in a YAML file
 -passwd <sudoPassword>   Sudo password
 -scaffold                Creates scaffolding for a new Chef/Karamel
                          Cookbook.
 -server <yamlFile>       Dropwizard configuration in a YAML file

Not sure if this is OK or not. I don't see anything else in the logs :(

Regarding the disk space:

troy@vm00394:~$ df -h
Filesystem                         Size  Used Avail Use% Mounted on
udev                                16G     0   16G   0% /dev
tmpfs                              3.2G  1.1M  3.2G   1% /run
/dev/mapper/ubuntu--vg-ubuntu--lv  146G  4.0G  136G   3% /
tmpfs                               16G     0   16G   0% /dev/shm
tmpfs                              5.0M     0  5.0M   0% /run/lock
tmpfs                               16G     0   16G   0% /sys/fs/cgroup
/dev/loop0                          94M   94M     0 100% /snap/core/9066
/dev/loop1                          89M   89M     0 100% /snap/core/7270
/dev/sda2                          976M  145M  765M  16% /boot
tmpfs                              3.2G     0  3.2G   0% /run/user/0
troy@vm00394:~$

I think 136G is more than enough, right?

Hi. The “Registering hopsworks instance…” error happened because our server had a power outage; it is back up now.
Try re-downloading the script:
wget https://raw.githubusercontent.com/logicalclocks/karamel-chef/1.3/hopsworks-installer.sh
chmod +x hopsworks-installer.sh
./hopsworks-installer.sh

First, clean up whatever has been installed:
./hopsworks-installer.sh -i purge -ni

Then re-run the script.

When you run “tail -f installation.log” and it fails with your error message above, it is because an invalid parameter was passed to the karamel command. The actual command that is run to install hopsworks is this (where XXXYYY is your sudo password):
cd karamel-0.6
setsid ./bin/karamel -headless -launch …/cluster-defns/hopsworks-installer-active.yml -passwd XXXYYY > …/installation.log 2>&1 &

Hi Jim,

Thanks for the explanation.
I've tried the installation again today. I'm not sure what I did differently, but this time the installer asked me for the sudo password after this step, which hadn't happened before:

Installing Karamel...

Press ENTER to continue
--2020-06-01 12:41:24--  http://www.karamel.io/sites/default/files/downloads/karamel-0.6.tgz
Resolving www.karamel.io (www.karamel.io)... 193.10.67.171
Connecting to www.karamel.io (www.karamel.io)|193.10.67.171|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 32377518 (31M) [application/x-gzip]
Saving to: ‘karamel-0.6.tgz’

karamel-0.6.tgz                                             100%[========================================================================================================================================>]  30.88M  20.1MB/s    in 1.5s

2020-06-01 12:41:25 (20.1 MB/s) - ‘karamel-0.6.tgz’ saved [32377518/32377518]

sudo: a password is required

It appears you need a sudo password for this account.
Enter the sudo password for troy:

[sudo] password for troy:
Running command from /home/troy/karamel-0.6:

So I didn’t see the previous error.
Unfortunately, I see this error now:

INFO  [2020-06-01 10:15:58,052] se.kth.karamel.backend.machines.SshMachine: 10.216.0.155: Running task: hops::docker_registry
ERROR [2020-06-01 10:16:15,891] se.kth.karamel.backend.dag.DagNode: Failed 'hops::docker_registry on 10.216.0.155' because '10.216.0.155: Command did not complete: mkdir -p /home/troy/.karamel/install ; cd /home/troy/.karamel/install; echo $$ > pid; echo '#!/bin/bash
set -eo pipefail
echo $(date '+%H:%M:%S'): 'hops__docker_registry' >> order
cat > hops__docker_registry.json <<-'END_OF_FILE'
{
 "hopsmonitor": {
   "default": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     }
   },
   "alertmanager": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     }
   },
   "purge_telegraf": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     }
   },
   "prometheus": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     }
   },
   "node_exporter": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     }
   }
 },
 "ndb": {
   "mysqld": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     }
   },
   "ndbd": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     }
   },
   "mgmd": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     },
     "public_key": "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDKpBCgHaWPXVQXKvKj2xDmOUS+4/Kt+X3odWRrwTQxLTGnOFO4Ahup5yvaKQRPSBjhpeN50R8aUlxf7aJDaTDTnBgRKCbd3+CyY+MANcBbgd/8j/vxP/aPG/HisYzjiL4XWwrdEy55EknLF/iTUqLvh7TNbmTVJzG/ApKivRPu3I99jlVNObD+eFst/dkYLfiNg69T9adD3jvrnu4IN6q8f/mrB6YXZFW614qkXaly+WtvY6Hl8av7yrELMsj6Ij1VMZszE7XMQaY06X4MDuj5pVDMQY/9wwjPD+L1n5ilguEQZXEg3fL5ObHuVi0dmmkNjBm2Hlku+MutYs1pEnpz mysql@vm00394n"
   }
 },
 "flink": {
   "yarn": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     }
   },
   "historyserver": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     }
   }
 },
 "hopslog": {
   "_filebeat-spark": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     }
   },
   "_filebeat-serving": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     }
   },
   "_filebeat-beam": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     }
   },
   "_filebeat-kagent": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     }
   },
   "default": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     }
   }
 },
 "kagent": {
   "default": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     }
   },
   "python_conda_versions": "3.6"
 },
 "consul": {
   "master": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     }
   }
 },
 "hops": {
   "rm": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     }
   },
   "ndb": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     }
   },
   "nn": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     }
   },
   "nm": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     }
   },
   "dn": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     }
   },
   "docker_registry": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     }
   },
   "jhs": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     }
   },
   "tls": {
     "enabled": "false"
   },
   "yarn": {
     "detect-hardware-capabilities": "false",
     "memory_mbs": "29696",
     "vcores": "3",
     "cgroups_strict_resource_usage": "false"
   },
   "rmappsecurity": {
     "actor_class": "org.apache.hadoop.yarn.server.resourcemanager.security.DevHopsworksRMAppSecurityActions"
   }
 },
 "hops_airflow": {
   "default": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     }
   },
   "sqoop": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     }
   }
 },
 "kzookeeper": {
   "default": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     }
   }
 },
 "epipe": {
   "default": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     }
   }
 },
 "elastic": {
   "default": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     }
   },
   "opendistro_security": {
     "logstash": {
       "password": "74154dae_201",
       "username": "logstash"
     },
     "epipe": {
       "username": "epipe",
       "password": "74154dae_201"
     },
     "admin": {
       "username": "admin",
       "password": "74154dae_201"
     },
     "audit": {
       "enable_transport": "false",
       "enable_rest": "true"
     },
     "jwt": {
       "exp_ms": "1800000"
     },
     "kibana": {
       "password": "74154dae_201",
       "username": "kibana"
     },
     "elastic_exporter": {
       "username": "elasticexporter",
       "password": "74154dae_201"
     }
   }
 },
 "hadoop_spark": {
   "historyserver": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     }
   },
   "certs": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     }
   },
   "yarn": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     }
   }
 },
 "hive2": {
   "default": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     }
   },
   "mysql_password": "74154dae_203"
 },
 "kkafka": {
   "default": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     }
   }
 },
 "tensorflow": {
   "default": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     }
   }
 },
 "conda": {
   "default": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     }
   }
 },
 "livy": {
   "default": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     }
   }
 },
 "hopsworks": {
   "default": {
     "private_ips": [
       "10.216.0.155"
     ],
     "public_ips": [
       "10.216.0.155"
     ],
     "private_ips_domainIds": {
       "10.216.0.155": "0"
     },
     "hosts": {
       "10.216.0.155": "10.216.0.155"
     },
     "public_key": "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC6CdGfAs8buIa8t1SJEO6dnnK/pQnNGRBH2aMw85Sz/qgCesq6184BffzzEUDBvz+5GjbKKDnFkPl23NouAYE6T5YcG2nFGJzd5NfGLNA7XAKPb1XBAi4jWD4SHuq2SJviebRyKDr8VxLTGpoP5CXzIhsQB9G3DMm//P39l/HzFocAxSLAkWKIePWiTU6CYk6yhUvQqWNTl7JW/+jmvZ8qTWG2YzJKGJeaFuzH5pHlqi+295nwp2GtoFqtzSVsJyBGceRpIqm+lnpo2p/cebdyPZGhYVyAHPFS/rSIi0uIT3lvFZuj8y4q1axdEY7/iuPqO2dvsXsfB3QtWB62oNoJ glassfish@vm00394n"
   },
   "application_certificate_validity_period": "6d",
   "kagent_liveness": {
     "threshold": "40s",
     "enabled": "true"
   },
   "requests_verify": "false",
   "featurestore_online": "true",
   "admin": {
     "password": "74154dae_201",
     "user": "adminuser"
   },
   "encryption_password": "74154dae_001",
   "master": {
     "password": "74154dae_002"
   },
   "https": {
     "port": "443"
   }
 },
 "install": {
   "kubernetes": "false",
   "dir": "/srv/hops",
   "cloud": "on-premises"
 },
 "mysql": {
   "password": "74154dae_202"
 },
 "alertmanager": {
   "email": {
     "to": "sre@logicalclocks.com",
     "smtp_host": "mail.hello.com",
     "from": "hopsworks@logicalclocks.com"
   }
 },
 "prometheus": {
   "retention_time": "8h"
 },
 "private_ips": [
   "10.216.0.155"
 ],
 "public_ips": [
   "10.216.0.155"
 ],
 "hosts": {
   "10.216.0.155": "10.216.0.155"
 },
 "run_list": [
   "hops::docker_registry"
 ]
}
END_OF_FILE
echo "%password_hidden%" | sudo -S  chef-solo -c /home/troy/.karamel/install/solo.rb -j /home/troy/.karamel/install/hops__docker_registry.json 2>&1 | tee hops__docker_registry.log
echo 'https://github.com/logicalclocks/hops-hadoop-chef/tree/1.3/hops::docker_registry' >> succeed_list
' > hops__docker_registry.sh ; chmod +x hops__docker_registry.sh ; ./hops__docker_registry.sh
', DAG is stuck here :(
INFO  [2020-06-01 10:16:18,870] se.kth.karamel.backend.machines.MachinesMonitor: Sending pause signal to all machines

I'm not sure what to do with this :(

Hi,
hops::docker_registry is a recipe we just added on master. Master is not stable at the moment.
Could you please run the purge command and also remove the cluster-defns directory so the script re-downloads the correct version of the cluster definition?
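A minimal sketch of those two steps, assuming the installer and the cluster-defns directory sit in the same working directory (that layout is an assumption). The helper name reset_install is hypothetical; the purge flags are the ones quoted earlier in this thread.

```shell
#!/bin/bash
# Hypothetical helper: purge the previous install and drop the cached cluster definitions.
reset_install() {
  local workdir="${1:-.}"
  (
    # Run in a subshell so the caller's working directory is left untouched.
    cd "$workdir" || exit 1
    # Purge command as given earlier in the thread.
    ./hopsworks-installer.sh -i purge -ni || exit 1
    # Removing the directory makes the installer download the definitions again.
    rm -rf cluster-defns
  )
}
# Example (not run here): reset_install "$HOME" && ./hopsworks-installer.sh
```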


Fabio

Hi Fabio,

Thanks for the tip :slight_smile:
I removed the cluster-defns folder and ran it again.
Now I see an error related to tensorflow.

set -eo pipefail
echo $(date '+%H:%M:%S'): 'tensorflow__default' >> order
cat > tensorflow__default.json <<-'END_OF_FILE'
{
  "hopsmonitor": {
    "alertmanager": {
      "private_ips": [
        "10.216.0.155"
      ],
      "public_ips": [
        "10.216.0.155"
      ],
      "private_ips_domainIds": {
        "10.216.0.155": "0"
      },
      "hosts": {
        "10.216.0.155": "10.216.0.155"
      }
    },
    "default": {
      "private_ips": [
        "10.216.0.155"
      ],
      "public_ips": [
        "10.216.0.155"
      ],
      "private_ips_domainIds": {
        "10.216.0.155": "0"
      },
      "hosts": {
        "10.216.0.155": "10.216.0.155"
      }
    },
    "purge_telegraf": {
      "private_ips": [
        "10.216.0.155"
      ],
      "public_ips": [
        "10.216.0.155"
      ],
      "private_ips_domainIds": {
        "10.216.0.155": "0"
      },
      "hosts": {
        "10.216.0.155": "10.216.0.155"
      }
    },
    "prometheus": {
      "private_ips": [
        "10.216.0.155"
      ],
      "public_ips": [
        "10.216.0.155"
      ],
      "private_ips_domainIds": {
        "10.216.0.155": "0"
      },
      "hosts": {
        "10.216.0.155": "10.216.0.155"
      }
    },
    "node_exporter": {
      "private_ips": [
        "10.216.0.155"
      ],
      "public_ips": [
        "10.216.0.155"
      ],
      "private_ips_domainIds": {
        "10.216.0.155": "0"
      },
      "hosts": {
        "10.216.0.155": "10.216.0.155"
      }
    }
  },
  "ndb": {
    "mysqld": {
      "private_ips": [
        "10.216.0.155"
      ],
      "public_ips": [
        "10.216.0.155"
      ],
      "private_ips_domainIds": {
        "10.216.0.155": "0"
      },
      "hosts": {
        "10.216.0.155": "10.216.0.155"
      }
    },
    "ndbd": {
      "private_ips": [
        "10.216.0.155"
      ],
      "public_ips": [
        "10.216.0.155"
      ],
      "private_ips_domainIds": {
        "10.216.0.155": "0"
      },
      "hosts": {
        "10.216.0.155": "10.216.0.155"
      }
    },
    "mgmd": {
      "private_ips": [
        "10.216.0.155"
      ],
      "public_ips": [
        "10.216.0.155"
      ],
      "private_ips_domainIds": {
        "10.216.0.155": "0"
      },
      "hosts": {
        "10.216.0.155": "10.216.0.155"
      },
      "public_key": "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDKpBCgHaWPXVQXKvKj2xDmOUS+4/Kt+X3odWRrwTQxLTGnOFO4Ahup5yvaKQRPSBjhpeN50R8aUlxf7aJDaTDTnBgRKCbd3+CyY+MANcBbgd/8j/vxP/aPG/HisYzjiL4XWwrdEy55EknLF/iTUqLvh7TNbmTVJzG/ApKivRPu3I99jlVNObD+eFst/dkYLfiNg69T9adD3jvrnu4IN6q8f/mrB6YXZFW614qkXaly+WtvY6Hl8av7yrELMsj6Ij1VMZszE7XMQaY06X4MDuj5pVDMQY/9wwjPD+L1n5ilguEQZXEg3fL5ObHuVi0dmmkNjBm2Hlku+MutYs1pEnpz mysql@vm00394n"
    }
  },
  "flink": {
    "yarn": {
      "private_ips": [
        "10.216.0.155"
      ],
      "public_ips": [
        "10.216.0.155"
      ],
      "private_ips_domainIds": {
        "10.216.0.155": "0"
      },
      "hosts": {
        "10.216.0.155": "10.216.0.155"
      }
    },
    "historyserver": {
      "private_ips": [
        "10.216.0.155"
      ],
      "public_ips": [
        "10.216.0.155"
      ],
      "private_ips_domainIds": {
        "10.216.0.155": "0"
      },
      "hosts": {
        "10.216.0.155": "10.216.0.155"
      }
    }
  },
  "hopslog": {
    "_filebeat-spark": {
      "private_ips": [
        "10.216.0.155"
      ],
      "public_ips": [
        "10.216.0.155"
      ],
      "private_ips_domainIds": {
        "10.216.0.155": "0"
      },
      "hosts": {
        "10.216.0.155": "10.216.0.155"
      }
    },
    "_filebeat-serving": {
      "private_ips": [
        "10.216.0.155"
      ],
      "public_ips": [
        "10.216.0.155"
      ],
      "private_ips_domainIds": {
        "10.216.0.155": "0"
      },
      "hosts": {
        "10.216.0.155": "10.216.0.155"
      }
    },
    "_filebeat-beam": {
      "private_ips": [
        "10.216.0.155"
      ],
      "public_ips": [
        "10.216.0.155"
      ],
      "private_ips_domainIds": {
        "10.216.0.155": "0"
      },
      "hosts": {
        "10.216.0.155": "10.216.0.155"
      }
    },
    "_filebeat-kagent": {
      "private_ips": [
        "10.216.0.155"
      ],
      "public_ips": [
        "10.216.0.155"
      ],
      "private_ips_domainIds": {
        "10.216.0.155": "0"
      },
      "hosts": {
        "10.216.0.155": "10.216.0.155"
      }
    },
    "default": {
      "private_ips": [
        "10.216.0.155"
      ],
      "public_ips": [
        "10.216.0.155"
      ],
      "private_ips_domainIds": {
        "10.216.0.155": "0"
      },
      "hosts": {
        "10.216.0.155": "10.216.0.155"
      }
    }
  },
  "kagent": {
    "default": {
      "private_ips": [
        "10.216.0.155"
      ],
      "public_ips": [
        "10.216.0.155"
      ],
      "private_ips_domainIds": {
        "10.216.0.155": "0"
      },
      "hosts": {
        "10.216.0.155": "10.216.0.155"
      }
    },
    "python_conda_versions": "3.6"
  },
  "consul": {
    "master": {
      "private_ips": [
        "10.216.0.155"
      ],
      "public_ips": [
        "10.216.0.155"
      ],
      "private_ips_domainIds": {
        "10.216.0.155": "0"
      },
      "hosts": {
        "10.216.0.155": "10.216.0.155"
      }
    }
  },
  "hops": {
    "rm": {
      "private_ips": [
        "10.216.0.155"
      ],
      "public_ips": [
        "10.216.0.155"
      ],
      "private_ips_domainIds": {
        "10.216.0.155": "0"
      },
      "hosts": {
        "10.216.0.155": "10.216.0.155"
      }
    },
    "ndb": {
      "private_ips": [
        "10.216.0.155"
      ],
      "public_ips": [
        "10.216.0.155"
      ],
      "private_ips_domainIds": {
        "10.216.0.155": "0"
      },
      "hosts": {
        "10.216.0.155": "10.216.0.155"
      }
    },
    "nn": {
      "private_ips": [
        "10.216.0.155"
      ],
      "public_ips": [
        "10.216.0.155"
      ],
      "private_ips_domainIds": {
        "10.216.0.155": "0"
      },
      "hosts": {
        "10.216.0.155": "10.216.0.155"
      }
    },
    "nm": {
      "private_ips": [
        "10.216.0.155"
      ],
      "public_ips": [
        "10.216.0.155"
      ],
      "private_ips_domainIds": {
        "10.216.0.155": "0"
      },
      "hosts": {
        "10.216.0.155": "10.216.0.155"
      }
    },
    "dn": {
      "private_ips": [
        "10.216.0.155"
      ],
      "public_ips": [
        "10.216.0.155"
      ],
      "private_ips_domainIds": {
        "10.216.0.155": "0"
      },
      "hosts": {
        "10.216.0.155": "10.216.0.155"
      }
    },
    "jhs": {
      "private_ips": [
        "10.216.0.155"
      ],
      "public_ips": [
        "10.216.0.155"
      ],
      "private_ips_domainIds": {
        "10.216.0.155": "0"
      },
      "hosts": {
        "10.216.0.155": "10.216.0.155"
      }
    },
    "tls": {
      "enabled": "false"
    },
    "yarn": {
      "detect-hardware-capabilities": "false",
      "memory_mbs": "29696",
      "vcores": "3",
      "cgroups_strict_resource_usage": "false"
    },
    "rmappsecurity": {
      "actor_class": "org.apache.hadoop.yarn.server.resourcemanager.security.DevHopsworksRMAppSecurityActions"
    }
  },
  "hops_airflow": {
    "default": {
      "private_ips": [
        "10.216.0.155"
      ],
      "public_ips": [
        "10.216.0.155"
      ],
      "private_ips_domainIds": {
        "10.216.0.155": "0"
      },
      "hosts": {
        "10.216.0.155": "10.216.0.155"
      }
    },
    "sqoop": {
      "private_ips": [
        "10.216.0.155"
      ],
      "public_ips": [
        "10.216.0.155"
      ],
      "private_ips_domainIds": {
        "10.216.0.155": "0"
      },
      "hosts": {
        "10.216.0.155": "10.216.0.155"
      }
    }
  },
  "kzookeeper": {
    "default": {
      "private_ips": [
        "10.216.0.155"
      ],
      "public_ips": [
        "10.216.0.155"
      ],
      "private_ips_domainIds": {
        "10.216.0.155": "0"
      },
      "hosts": {
        "10.216.0.155": "10.216.0.155"
      }
    }
  },
  "epipe": {
    "default": {
      "private_ips": [
        "10.216.0.155"
      ],
      "public_ips": [
        "10.216.0.155"
      ],
      "private_ips_domainIds": {
        "10.216.0.155": "0"
      },
      "hosts": {
        "10.216.0.155": "10.216.0.155"
      }
    }
  },
  "elastic": {
    "default": {
      "private_ips": [
        "10.216.0.155"
      ],
      "public_ips": [
        "10.216.0.155"
      ],
      "private_ips_domainIds": {
        "10.216.0.155": "0"
      },
      "hosts": {
        "10.216.0.155": "10.216.0.155"
      }
    },
    "opendistro_security": {
      "logstash": {
        "password": "19fbc806_201",
        "username": "logstash"
      },
      "epipe": {
        "username": "epipe",
        "password": "19fbc806_201"
      },
      "admin": {
        "username": "admin",
        "password": "19fbc806_201"
      },
      "audit": {
        "enable_transport": "false",
        "enable_rest": "true"
      },
      "jwt": {
        "exp_ms": "1800000"
      },
      "kibana": {
        "password": "19fbc806_201",
        "username": "kibana"
      },
      "elastic_exporter": {
        "username": "elasticexporter",
        "password": "19fbc806_201"
      }
    }
  },
  "hadoop_spark": {
    "historyserver": {
      "private_ips": [
        "10.216.0.155"
      ],
      "public_ips": [
        "10.216.0.155"
      ],
      "private_ips_domainIds": {
        "10.216.0.155": "0"
      },
      "hosts": {
        "10.216.0.155": "10.216.0.155"
      }
    },
    "certs": {
      "private_ips": [
        "10.216.0.155"
      ],
      "public_ips": [
        "10.216.0.155"
      ],
      "private_ips_domainIds": {
        "10.216.0.155": "0"
      },
      "hosts": {
        "10.216.0.155": "10.216.0.155"
      }
    },
    "yarn": {
      "private_ips": [
        "10.216.0.155"
      ],
      "public_ips": [
        "10.216.0.155"
      ],
      "private_ips_domainIds": {
        "10.216.0.155": "0"
      },
      "hosts": {
        "10.216.0.155": "10.216.0.155"
      }
    }
  },
  "hive2": {
    "default": {
      "private_ips": [
        "10.216.0.155"
      ],
      "public_ips": [
        "10.216.0.155"
      ],
      "private_ips_domainIds": {
        "10.216.0.155": "0"
      },
      "hosts": {
        "10.216.0.155": "10.216.0.155"
      }
    },
    "mysql_password": "19fbc806_203"
  },
  "kkafka": {
    "default": {
      "private_ips": [
        "10.216.0.155"
      ],
      "public_ips": [
        "10.216.0.155"
      ],
      "private_ips_domainIds": {
        "10.216.0.155": "0"
      },
      "hosts": {
        "10.216.0.155": "10.216.0.155"
      }
    }
  },
  "tensorflow": {
    "default": {
      "private_ips": [
        "10.216.0.155"
      ],
      "public_ips": [
        "10.216.0.155"
      ],
      "private_ips_domainIds": {
        "10.216.0.155": "0"
      },
      "hosts": {
        "10.216.0.155": "10.216.0.155"
      }
    }
  },
  "conda": {
    "default": {
      "private_ips": [
        "10.216.0.155"
      ],
      "public_ips": [
        "10.216.0.155"
      ],
      "private_ips_domainIds": {
        "10.216.0.155": "0"
      },
      "hosts": {
        "10.216.0.155": "10.216.0.155"
      }
    }
  },
  "livy": {
    "default": {
      "private_ips": [
        "10.216.0.155"
      ],
      "public_ips": [
        "10.216.0.155"
      ],
      "private_ips_domainIds": {
        "10.216.0.155": "0"
      },
      "hosts": {
        "10.216.0.155": "10.216.0.155"
      }
    }
  },
  "hopsworks": {
    "default": {
      "private_ips": [
        "10.216.0.155"
      ],
      "public_ips": [
        "10.216.0.155"
      ],
      "private_ips_domainIds": {
        "10.216.0.155": "0"
      },
      "hosts": {
        "10.216.0.155": "10.216.0.155"
      },
      "public_key": "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC6CdGfAs8buIa8t1SJEO6dnnK/pQnNGRBH2aMw85Sz/qgCesq6184BffzzEUDBvz+5GjbKKDnFkPl23NouAYE6T5YcG2nFGJzd5NfGLNA7XAKPb1XBAi4jWD4SHuq2SJviebRyKDr8VxLTGpoP5CXzIhsQB9G3DMm//P39l/HzFocAxSLAkWKIePWiTU6CYk6yhUvQqWNTl7JW/+jmvZ8qTWG2YzJKGJeaFuzH5pHlqi+295nwp2GtoFqtzSVsJyBGceRpIqm+lnpo2p/cebdyPZGhYVyAHPFS/rSIi0uIT3lvFZuj8y4q1axdEY7/iuPqO2dvsXsfB3QtWB62oNoJ glassfish@vm00394n"
    },
    "application_certificate_validity_period": "6d",
    "kagent_liveness": {
      "threshold": "40s",
      "enabled": "true"
    },
    "requests_verify": "false",
    "featurestore_online": "true",
    "admin": {
      "password": "19fbc806_201",
      "user": "adminuser"
    },
    "encryption_password": "19fbc806_001",
    "master": {
      "password": "19fbc806_002"
    },
    "https": {
      "port": "443"
    }
  },
  "install": {
    "kubernetes": "false",
    "dir": "/srv/hops",
    "cloud": "on-premises"
  },
  "mysql": {
    "password": "19fbc806_202"
  },
  "alertmanager": {
    "email": {
      "to": "sre@logicalclocks.com",
      "smtp_host": "mail.hello.com",
      "from": "hopsworks@logicalclocks.com"
    }
  },
  "prometheus": {
    "retention_time": "8h"
  },
  "private_ips": [
    "10.216.0.155"
  ],
  "public_ips": [
    "10.216.0.155"
  ],
  "hosts": {
    "10.216.0.155": "10.216.0.155"
  },
  "run_list": [
    "tensorflow::default"
  ]
}
END_OF_FILE
echo "%password_hidden%" | sudo -S  chef-solo -c /home/troy/.karamel/install/solo.rb -j /home/troy/.karamel/install/tensorflow__default.json 2>&1 | tee tensorflow__default.log
echo 'https://github.com/logicalclocks/tensorflow-chef/tree/1.3/tensorflow::default' >> succeed_list
' > tensorflow__default.sh ; chmod +x tensorflow__default.sh ; ./tensorflow__default.sh
', DAG is stuck here :(
INFO  [2020-06-01 18:06:33,609] se.kth.karamel.backend.machines.MachinesMonitor: Sending pause signal to all machines```

Hi,
Sorry for the late reply.
You can find the logs of the recipe execution, including the actual error, in /home/troy/.karamel/install/tensorflow__default.log

From experience, the recipe sometimes fails with a transient error while communicating with pypi.org or github.com.
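
If the log is too long to read through, a quick filter may help narrow it down. The snippet below is only a rough sketch: the function name `show_recipe_errors` is made up for illustration, and the search patterns (fatal, error, timeout, connection refused/reset) are a guess at what a failed chef-solo run or a network hiccup might print, not an exhaustive list.

```shell
#!/bin/bash
# Hypothetical helper: print the last 20 lines of a log that look like errors.
# The patterns are assumptions; adjust them to whatever your log contains.
show_recipe_errors() {
  local log_file="$1"
  if [ ! -f "$log_file" ]; then
    echo "log file not found: $log_file" >&2
    return 1
  fi
  # -n prefixes each match with its line number, -i ignores case,
  # -E enables the alternation syntax used in the pattern.
  grep -niE 'fatal|error|timed? ?out|connection (refused|reset)' "$log_file" | tail -n 20
}

# Example call, using the path from the Karamel output above:
# show_recipe_errors /home/troy/.karamel/install/tensorflow__default.log
```

The line numbers in the output tell you where to look in the full log for the surrounding context. If the matches point at pypi.org or github.com, it may be worth simply running /home/troy/.karamel/install/tensorflow__default.sh again.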


Fabio

Hi Fabio,

If it’s not too much trouble, could you please have a look at the attached log file? I can’t really make sense of it :frowning:
Thanks again