Construction of a knowledge graph of rare and endangered plants integrating the Albert model

Fund project: Science and Technology Commission of Shanghai Municipality project (20dz1203800)


    Abstract:

    Knowledge about the morphological characteristics, taxonomic ranks, endangerment coefficients, and protection measures of rare and endangered plants is unclear. To address this, a knowledge extraction framework that incorporates the lightweight bidirectional encoder representations from Transformers model (Albert) was designed to extract rare and endangered plant knowledge in batches and thereby build a knowledge graph of rare and endangered plants: 1) Based on existing general plant ontologies, a rare and endangered plant ontology was constructed in a top-down manner, yielding five systems: species classification, growth morphological characteristics, nomenclature, conservation status, and ecological habits; 2) The Albert pre-trained model was used to strengthen how well the input vectors of the downstream task models represent the semantics of rare and endangered plant attribute description text; 3) A BiLSTM–CRF model and a BiGRU–Attention model were used for named entity recognition and relation extraction, respectively. The models were validated on a rare and endangered plant test set. The results showed that the F1 values (the harmonic mean of precision and recall) of the named entity recognition model and the relation extraction model reached 98.07% and 93.76%, respectively. The triples formed from the large number of extracted entities and relations were stored in the graph database Neo4j to provide a visual display of the rare and endangered plant knowledge graph.
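The last two steps described in the abstract, scoring with F1 and storing extracted triples in Neo4j, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the `Triple` type, the `Entity` label, the `name` property, and the relation name `BELONGS_TO_FAMILY` are all hypothetical, and the sketch only builds Cypher text and parameters without connecting to a database.

```python
from typing import Dict, NamedTuple, Tuple


class Triple(NamedTuple):
    """One (head, relation, tail) fact produced by NER plus relation extraction."""
    head: str       # head entity, e.g. a species name
    relation: str   # relation type linking the two entities
    tail: str       # tail entity, e.g. a family name


def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def to_cypher(triple: Triple) -> Tuple[str, Dict[str, str]]:
    """Build a Cypher MERGE statement and its parameters for one triple.

    Entity names are passed as query parameters. The relation type is
    interpolated into the query text, so it is validated first.
    The `Entity` label and `name` property are assumptions of this sketch.
    """
    if not triple.relation.isidentifier():
        raise ValueError(f"unsafe relation type: {triple.relation!r}")
    query = (
        "MERGE (h:Entity {name: $head}) "
        "MERGE (t:Entity {name: $tail}) "
        f"MERGE (h)-[:{triple.relation}]->(t)"
    )
    return query, {"head": triple.head, "tail": triple.tail}


# Illustrative triple only; it is not taken from the paper's data set.
example = Triple("Ginkgo biloba", "BELONGS_TO_FAMILY", "Ginkgoaceae")
example_query, example_params = to_cypher(example)
```

Using `MERGE` rather than `CREATE` means that re-running the import does not duplicate nodes or relationships for entities that were already stored, which matters when many triples share the same head or tail entity.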

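Cite this article

TIAN Menghui, CHEN Ming, XI Xiaotao. Construction of a knowledge graph of rare and endangered plants integrating the Albert model[J]. Journal of Hunan Agricultural University (Natural Sciences), 2023, 49(5):.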

History
  • Published online: 2023-11-10