General questions about the Hopsworks platform

Hello,

My team and I are using Hadoop ecosystem solutions. Since it became a paid product, we're looking for a new solution.
So I had a look at the Hopsworks platform. It's a good project, but I would like to ask some questions:

  • Is HopsFS like Hadoop's HDFS?
  • Is Hopsworks an open-source project?
  • Could we connect some Hadoop services (HBase, for example) to the Hopsworks platform?
  • Does Hopsworks have an equivalent of Hadoop's Apache Hive?

Thanks in advance,

Best regards,

Hi smach_h

Both Hops and Hopsworks are open-source. Hops is licensed under the Apache License 2.0; Hopsworks is licensed under AGPLv3.

HopsFS is a new version of HDFS. It has the same API, but:

  • its security model is now TLS (X.509 certificates), not Kerberos;
  • it supports multiple stateless NameNodes, so there is no wait to fail over from an active to a standby NameNode;
  • NameNode metadata is stored in an external database (NDB by default);
  • blocks can be stored either locally on DataNodes or in object stores (S3 or Azure Blob Storage).
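
To make the "stateless NameNodes" point more concrete, here is a rough toy model in Python. It is not HopsFS code: the class names (MetadataStore, NameNode, Client), the paths, and the storage locations are all made up for illustration. It only shows the idea that, when the namespace lives in a shared external store, any live NameNode can answer a request, so losing one does not require promoting a standby first.

```python
class MetadataStore:
    """Hypothetical stand-in for the external database (NDB by default in HopsFS)."""

    def __init__(self):
        self._paths = {}  # path -> where the file's blocks are stored

    def put(self, path, location):
        self._paths[path] = location

    def get(self, path):
        return self._paths[path]


class NameNode:
    """Stateless in this sketch: holds no namespace, only a handle to the shared store."""

    def __init__(self, name, store):
        self.name = name
        self.store = store
        self.alive = True

    def create(self, path, location):
        self.store.put(path, location)

    def locate(self, path):
        return self.store.get(path)


class Client:
    """Uses the first live NameNode; any of them works because the state is shared."""

    def __init__(self, namenodes):
        self.namenodes = namenodes

    def _pick(self):
        for nn in self.namenodes:
            if nn.alive:
                return nn
        raise RuntimeError("no live NameNode")

    def create(self, path, location):
        self._pick().create(path, location)

    def locate(self, path):
        nn = self._pick()
        return nn.name, nn.locate(path)


store = MetadataStore()
nn1, nn2 = NameNode("nn1", store), NameNode("nn2", store)
client = Client([nn1, nn2])

# Blocks may live on a DataNode or in an object store (names are invented).
client.create("/data/events.parquet", "datanode-3")
client.create("/data/archive.parquet", "s3://example-bucket")

before = client.locate("/data/events.parquet")
nn1.alive = False  # simulate losing one NameNode
after = client.locate("/data/events.parquet")
print(before, after)
```

In this sketch the second lookup is served by nn2 with the same result, because nn2 reads the same metadata store that nn1 wrote to. The real system is of course far more involved (transactions, caching, leader election for housekeeping tasks, and so on).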

Hive is part of Hopsworks. It is a modified version of Hive, where the metadata is stored in the same database (NDB) as HopsFS.

We have not added support for HBase to Hops/Hopsworks. If you want security, you would need to switch HBase from Kerberos to TLS, which is a good bit of work.
