AlphaFold "Could find neither hhm_db nor a3m_db": troubleshooting log

@hanq  2022-02-24 09:27

Error log

     Traceback (most recent call last):
     File "/app/alphafold/run_alphafold.py", line 445, in <module>
     app.run(main)
     File "/opt/conda/lib/python3.7/site-packages/absl/app.py", line 312, in run
     _run_main(main, args)
     File "/opt/conda/lib/python3.7/site-packages/absl/app.py", line 258, in _run_main
     sys.exit(main(argv))
     File "/app/alphafold/run_alphafold.py", line 429, in main
     is_prokaryote=is_prokaryote)
     File "/app/alphafold/run_alphafold.py", line 177, in predict_structure
     msa_output_dir=msa_output_dir)
     File "/app/alphafold/alphafold/data/pipeline.py", line 225, in process
     use_precomputed_msas=self.use_precomputed_msas)
     File "/app/alphafold/alphafold/data/pipeline.py", line 101, in run_msa_tool
     result = msa_runner.query(input_fasta_path)[0]
     File "/app/alphafold/alphafold/data/tools/hhblits.py", line 144, in query
     stdout.decode('utf-8'), stderr[:500_000].decode('utf-8')))
     RuntimeError: HHblits failed
     stdout:



     stderr:
     - 02:15:01.531 ERROR: Could find neither hhm_db nor a3m_db!
    

References collected

Troubleshooting

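  1. Added --db_preset=reduced_dbs to the run command to use the fast prediction mode, which does not run HHblits against the full BFD and Uniclust30 databases. This ran normally.
  2. Stepped through the Dockerfile instructions one at a time to build an image instance by hand.

    1. The key base image taken from the Dockerfile is nvidia/cuda:11.1-cudnn8-runtime-ubuntu18.04. Created a Docker container from this image and entered it: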

      sudo docker run -it --name alphafold-ldmf -v /pathTo/alphafold/:/data/alphafold -v /pathTo/alphafold-data/:/data/alphafold-data --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=all nvidia/cuda:11.1-cudnn8-runtime-ubuntu18.04
  3. Changed the DNS servers and the pip index, then ran each line of the Dockerfile one step at a time.

    • /etc/resolv.conf

      nameserver 114.114.114.114
      nameserver 8.8.8.8
      nameserver 119.29.29.29
      nameserver 223.5.5.5
      
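  4. Cloned the alphafold_non_docker project with git.
  5. Downloaded the alphafold v2.1.1 release code with wget (the alphafold_non_docker author did not yet support the latest v2.1.2) and unpacked it under alphafold_non_docker.
  6. Edited run_alphafold.sh under alphafold_non_docker to set the alphafold path:

    Change current_working_dir=$(pwd) to current_working_dir=$(pwd)/alphafold-2.1.1
  7. Copied the test file T1050.fasta and ran: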

    bash run_alphafold.sh -d /data/alphafold-data -o /data/alphafold-result -f /data/alphafold/T1050.fasta -t 2020-05-14
    
  8. Got the same error as in the title.
  9. Extracted the individual command from the log and ran HHblits directly:

    /usr/bin/hhblits -i /data/alphafold/T1050.fasta -cpu 4 -oa3m /tmp/tmpsv9qnin5/output.a3m -o /dev/null -n 3 -e 0.001 -maxseq 1000000 -realign_max 100000 -maxfilt 100000 -min_prefilter_hits 1000 -d /data/alphafold-data/origin_bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt -d /data/alphafold-data/uniclust30/uniclust30_2018_08/uniclust30_2018_08
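  10. Searched for hhblits, found its git repository, cloned it, and confirmed from the source code that the command in step 9 actually reads:

    • the .ffindex and .ffdata files under /data/alphafold-data/origin_bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt
    • the .ffindex and .ffdata files under /data/alphafold-data/uniclust30/uniclust30_2018_08/uniclust30_2018_08
  11. Observed that the .ffindex and .ffdata files exist in both locations but lack read (r) permission, so granted read permission with chmod.
  12. Re-ran, waited, waited, waited, and the problem was solved.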
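
The check in steps 10 and 11 can be sketched as a small shell helper. This is only an illustration, not part of the original debugging session: the function name check_hhblits_db is hypothetical, and it assumes that each -d argument is a filename prefix and that the database follows the usual HH-suite layout of six files per prefix (_hhm, _a3m and _cs219, each with .ffdata and .ffindex).

```shell
# Hypothetical helper (not part of AlphaFold or HH-suite): report which
# database files for a given -d prefix are missing or unreadable by the
# current user. Assumes the usual six-file HH-suite layout per prefix.
check_hhblits_db() {
    prefix="$1"
    rc=0
    for suffix in _hhm.ffdata _hhm.ffindex _a3m.ffdata _a3m.ffindex _cs219.ffdata _cs219.ffindex; do
        file="${prefix}${suffix}"
        if [ ! -e "$file" ]; then
            echo "missing: $file"
            rc=1
        elif [ ! -r "$file" ]; then
            echo "unreadable: $file"
            rc=1
        fi
    done
    return "$rc"
}

# Example call with one of the prefixes from step 9:
# check_hhblits_db /data/alphafold-data/uniclust30/uniclust30_2018_08/uniclust30_2018_08
```

Run it as the same user that launches AlphaFold, since the readability test depends on who is asking (root can read almost anything).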

Solution

Check the permissions on the AlphaFold data files and grant read permission on all of them:

cd /pathTo/alphafold-data/
chmod -R +r ./*
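
A possible variation, offered as a sketch rather than as the fix that was actually tested here: a plain +r is filtered by the current umask and does not add the execute (traverse) bit on directories. Using a+rX grants read to everyone and adds execute only on directories (and on files that are already executable). The function name grant_read_access is hypothetical.

```shell
# Hypothetical helper: make a data directory readable by all users.
# a+r : read bit for user, group and other, regardless of umask
# a+X : execute bit only on directories and already-executable files,
#       so directories can be traversed and data files stay non-executable
grant_read_access() {
    data_dir="$1"
    chmod -R a+rX "$data_dir"
}

# Example call (path as used in this log):
# grant_read_access /pathTo/alphafold-data/
```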

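In other words, the fix from the collected reference "Singularity - HHblits: Could find neither hhm_db nor a3m_db #202" is confirmed to work: the permissions on the data files need to be changed.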
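
To confirm that the permission change took effect, one option is to list any database files that still lack a read bit. This is a minimal sketch: the function name list_unreadable_db_files is hypothetical, and it looks only at the permission bits of .ffindex and .ffdata files, not at ownership or ACLs.

```shell
# Hypothetical helper: print .ffindex/.ffdata files under a directory that
# are missing the read bit for at least one of user, group or other.
# An empty result means every such file is readable by everyone.
list_unreadable_db_files() {
    db_dir="$1"
    find "$db_dir" -type f \( -name '*.ffindex' -o -name '*.ffdata' \) ! -perm -444
}

# Example call (path as used in this log):
# list_unreadable_db_files /pathTo/alphafold-data/
```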

Appendix

