Timeout when I'm trying to connect with mysql

Hi, I’m exploring a little bit Hopsworks.ai and I add a new connector with mysql database but I’m getting an apparently timeout.
Do you have any suggestion about how can I test if my connection is correctly settled? thanks!

An error was encountered:
An error occurred while calling o2680.load.
: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1129)
at com.mysql.jdbc.MysqlIO.(MysqlIO.java:358)
at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2498)
at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2535)
at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2320)
at com.mysql.jdbc.ConnectionImpl.(ConnectionImpl.java:834)
at com.mysql.jdbc.JDBC4Connection.(JDBC4Connection.java:46)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:416)
at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:347)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$createConnectionFactory$1.apply(JdbcUtils.scala:63)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$createConnectionFactory$1.apply(JdbcUtils.scala:54)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:56)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation$.getSchema(JDBCRelation.scala:210)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:35)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:318)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:223)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:167)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.ConnectException: Connection timed out (Connection timed out)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:607)
at java.net.Socket.connect(Socket.java:556)
at java.net.Socket.(Socket.java:452)
at java.net.Socket.(Socket.java:262)
at com.mysql.jdbc.StandardSocketFactory.connect(StandardSocketFactory.java:256)
at com.mysql.jdbc.MysqlIO.(MysqlIO.java:308)
… 32 more

Traceback (most recent call last):
File “/srv/hops/anaconda/anaconda/envs/python36/lib/python3.6/site-packages/hops/featurestore.py”, line 317, in get_feature
jdbc_args=jdbc_args, online=online)
File “/srv/hops/anaconda/anaconda/envs/python36/lib/python3.6/site-packages/hops/featurestore_impl/core.py”, line 416, in _do_get_feature
_register_on_demand_featuregroups_as_temp_tables(on_demand_featuregroups, featurestore, jdbc_args)
File “/srv/hops/anaconda/anaconda/envs/python36/lib/python3.6/site-packages/hops/featurestore_impl/core.py”, line 671, in _register_on_demand_featuregroups_as_temp_tables
spark_df = _do_get_on_demand_featuregroup(fg, featurestore, j_args)
File “/srv/hops/anaconda/anaconda/envs/python36/lib/python3.6/site-packages/hops/featurestore_impl/core.py”, line 737, in _do_get_on_demand_featuregroup
.option(constants.SPARK_CONFIG.SPARK_JDBC_DBTABLE, “(” + featuregroup.on_demand_featuregroup.query + “) fs_q”)
File “/srv/hops/spark/python/lib/pyspark.zip/pyspark/sql/readwriter.py”, line 172, in load
return self._df(self._jreader.load())
File “/srv/hops/spark/python/lib/py4j-src.zip/py4j/java_gateway.py”, line 1257, in call
answer, self.gateway_client, self.target_id, self.name)
File “/srv/hops/spark/python/lib/pyspark.zip/pyspark/sql/utils.py”, line 63, in deco
return f(*a, **kw)
File “/srv/hops/spark/python/lib/py4j-src.zip/py4j/protocol.py”, line 328, in get_return_value
format(target_id, “.”, name), value)
py4j.protocol.Py4JJavaError: An error occurred while calling o2680.load.
: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1129)
at com.mysql.jdbc.MysqlIO.(MysqlIO.java:358)
at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2498)
at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2535)
at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2320)
at com.mysql.jdbc.ConnectionImpl.(ConnectionImpl.java:834)
at com.mysql.jdbc.JDBC4Connection.(JDBC4Connection.java:46)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:416)
at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:347)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$createConnectionFactory$1.apply(JdbcUtils.scala:63)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$createConnectionFactory$1.apply(JdbcUtils.scala:54)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:56)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation$.getSchema(JDBCRelation.scala:210)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:35)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:318)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:223)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:167)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.ConnectException: Connection timed out (Connection timed out)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:607)
at java.net.Socket.connect(Socket.java:556)
at java.net.Socket.(Socket.java:452)
at java.net.Socket.(Socket.java:262)
at com.mysql.jdbc.StandardSocketFactory.connect(StandardSocketFactory.java:256)
at com.mysql.jdbc.MysqlIO.(MysqlIO.java:308)
… 32 more

Hi,

we tried to work out what’s causing your error, but we need a few more details to be able to help.
Can you explain what exactly you did to end up with the error?
E.g. did you register an on-demand featuregroup from your MySQL database with the feature store and now you are trying to access a feature?

Usually, if you get a SQLException: Connection refused or Connection timed out or a MySQL specific CommunicationsException:
Communications link failure , then it means that your DB isn’t reachable.

When calling the get_feature how did you provide the jdbc_args of the storage connector that you previously created? E.g. url/connection string, password, user?

You can also try and access a table in your DB directly through Spark, to make sure you are able to establish a connection:

dataframe_mysql = spark.read.format("jdbc").options(
    url="jdbc:mysql://localhost:3306/my_db_name",
    dbtable = "my_tablename",
    user="username",
    password="password").load()

where spark is your Spark session, and you have to replace the other options with your specific DB details.

Thanks!

Thanks @moritzmeister! I tried to establish the connection through spark (without success) I am sure that it’s because of the proxy settings I am working on. Do you know if the proxy should be configured directly from the spark configuration? or is there any hopsworks functionality to configure it? thanks!

I’ve tested the connection into my local env and it works, so I don’t know if it’s a problem with mysql driver into hopsworks UI env (I’m trying to connect to mysql database using jupyter notebooks in the hopsworks demo cluster)

Okay, without knowing your exact proxy setup, it is hard to tell what’s going on.

To exclude that Spark is the problem, can you try establishing a connection through SQLAlchemy and read something through for example pandas?

from sqlalchemy import create_engine
import pandas as pd

db_connection_str = 'mysql://username:password@host:port/database'
db_connection = create_engine(db_connection_str)

df = pd.read_sql('SELECT * FROM table_name', con=db_connection)

Another thing you can try is to attach your MySQL driver jar to the Spark session, you can do so when you start Jupyter in the “Advanced Configuration” section. We are adding a MySQL driver to Spark by default, but maybe it is incompatible.
image

Thank you @moritzmeister! I’ve tried your suggestions but I got the next message:

    sqlalchemy.exc.OperationalError: (MySQLdb._exceptions.OperationalError) (2003, "Can't connect to MySQL server on '172.16.22.45' (110)")
(Background on this error at: http://sqlalche.me/e/e3q8)

Maybe it can help if I tell you that I want to experiment with hopsworks creating my features (and manipulating them) within the feature store, reading my data directly from my database, creating groups etc. etc. Is there any way I can do it directly with python (and not with jupyter notebooks) inside hopsworks.ai?

I think that the problem is that the server doesn’t listen to outside requests. we have to edit: /etc/mysql/conf.d/
and add bind-address = 0.0.0.0
but I think I don’t have permissions to do so.

Yes, hopsworks needs to be able to reach the database server.
Is the database running in AWS, too?

To connect to the feature store from any Python environment, we have a SDK, however, it only includes a subset of the functionality of the hops client library on hopsworks, as we don’t have Spark available as execution engine on a pure Python environment. Instead it is using Pandas Dataframes.

To get started with the SDK, follow the guide here: https://hopsworks.readthedocs.io/en/latest/featurestore/integrations/guides/custom.html
You can find the documentation of the API here: http://hopsworks-cloud-sdk.logicalclocks.com/