Abstract:Aiming at the problem of unclear knowledge of morphological characteristics, classification levels, endangerment coefficients, and protection measures in the field of rare and endangered plants, a knowledge extraction model framework based on Albert is designed to realize the batch extraction of rare and endangered plant knowledge and construct the knowledge graph of rare and endangered plants: 1) On the basis of the existing general plant ontology, the rare and endangered plant ontology is constructed in a top-down manner, and five systems are obtained, namely, species classification system, growth morphological characteristic system, and nomenclature system, conservation status system and ecological habit system; 2) The Albert model was adopted to enhance the representation ability of the text semantics of the rare and endangered plant attribute description text input vector of the downstream task model; 3) The BiLSTM CRF model and BiGRU Attention model are used to realize named entity recognition and relation extraction, respectively, and the effectiveness of the model was verified on the rare and endangered plant data test set, and the results showed that the harmonic mean (F1) values of recall and accuracy of the named entity recognition model and the relation extraction model reached 98.07% and 93.76%, respectively, and the triples formed by a large number of entities and relationships were stored in the graph database Neo4j in order to complete the visual display of the knowledge graph of rare and endangered plants.