The hdfs_quota_update table grows unexpectedly

We have a 5-node HopsFS cluster with 2 namenodes and 3 datanodes, and about 10 PySpark jobs running continuously that write to HopsFS. Suddenly, the hdfs_quota_update table has begun to grow, accumulating lots of records every second. The growth does not seem to stop, so I am asking why this table is growing and whether it is possible to delete its content.

These are the software versions:

  • hopsworks: 1.0.0
  • hopsfs:

File system changes, such as data written, deleted, or moved, are recorded in the hdfs_quota_update table. The content of this table is consumed by a thread on the leader namenode; after processing, the rows are deleted. I suspect that for some reason (a bug) this thread is not making any progress. The first course of action would be to restart the namenode to see if it helps. You can restart the namenode from the command line:

```
systemctl restart namenode
```

or you can do it using the admin panel:

Admin → Services → HDFS → namenode → Start/Stop

Hi @salman,
Thank you for your answer and your support. Unfortunately, even after restarting the namenodes, the table continues to grow; I can see many writes every second and, I think, no deletes. Is it possible to clean up this table or disable this HDFS quota mechanism?

Thank you

Yes, it is possible to disable the quota system. You will have to modify the hdfs-site.xml file in the /srv/hops/hadoop/etc/hadoop folder and set the property dfs.namenode.quota.enabled to false.


You only need to change this parameter on the machine where namenode is running.
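For reference, a sketch of what that entry would look like in hdfs-site.xml; the property name is the one given above, and the surrounding `<configuration>` element is the standard Hadoop config layout:

```xml
<configuration>
  <!-- Disable the HopsFS quota system (property name as given above) -->
  <property>
    <name>dfs.namenode.quota.enabled</name>
    <value>false</value>
  </property>
</configuration>
```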

Would it be possible for you to share the logs for the namenode so that we can identify the problem and fix it? The logs I am interested in are in /srv/hops/hadoop/logs folder. I would need logs for the namenode which are named hadoop-hdfs-namenode*

Hi @salman,
You can download the logs here HopsFSNNlogs

Is it possible to truncate the table once I have set the flag to false?

Thank you for your support

Yes, you can truncate the table. Shut down the namenode before truncating the hdfs_quota_update table.
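A sketch of that procedure, assuming the table lives in a `hops` schema on the NDB cluster and that the mysql client and socket paths match a default HopsFS install (all three are assumptions; adjust them for your deployment):

```shell
# 1. Stop the namenode first so no new quota updates are written
#    (restart command from earlier in the thread)
systemctl stop namenode

# 2. Truncate the table via the MySQL client.
#    Client path, socket, and "hops" schema are assumed defaults; verify yours.
/srv/hops/mysql/bin/mysql -u root -S /srv/hops/mysql-cluster/mysql.sock \
  -e "TRUNCATE TABLE hops.hdfs_quota_update;"

# 3. Start the namenode again
systemctl start namenode
```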

@arosc Thanks for the logs.

@arosc OK, I know why it is failing. This is a configuration issue ([HOPS-1632] Reduce quota manager batch size · hopshadoop/hops@3ab7194 · GitHub).

If you have not truncated the table and still want to use the quota system, then you can fix this by adding the following to hdfs-site.xml on all the namenodes:


Hi @salman,
I changed the configuration according to your suggestion, but the table keeps growing. Do you have any other ideas? Otherwise I will proceed with deleting the content. Thanks so much again.

Oei! Based on the logs, it was failing because of the large batch size for quota updates. If reducing the batch size does not help, then it could be some other bug.
Proceed with disabling quota. I will try to reproduce the problem.
How big is the hdfs_quota_update table now? Would it be possible to provide the latest logs after you changed the configuration? Thanks for your help.
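To answer the size question, a simple row count against the table should do. As above, the mysql client path, socket, and `hops` schema name are assumptions based on a typical HopsFS install:

```shell
# Count rows in the quota-update table
# (adjust client path / socket / schema to your install)
/srv/hops/mysql/bin/mysql -u root -S /srv/hops/mysql-cluster/mysql.sock \
  -e "SELECT COUNT(*) FROM hops.hdfs_quota_update;"
```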

@salman the table now has 6,893,268 records :slight_smile: Here are the logs: logs

Thank you for your support

PS. What about this error? SEVERE: Error in NdbJTie: returnCode -1, code 266, mysqlCode 146, status 1, classification 10, message Time-out in NDB, probably caused by deadlock