Knowledge Graph DKN Source Code Explained (Part 3): config.py

程序员文章站 2022-03-04 13:13:09
...
class BaseConfig():
    """
    General configurations applied to all models
    """
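    num_epochs = 2  # Number of training epochs
    num_batches_show_loss = 100  # Number of batches between loss printouts
    num_batches_validate = 1000  # Number of batches between metric checks on the validation dataset
    batch_size = 128  # Number of samples (sentences) per batch
    learning_rate = 0.0001  # Learning rate
    num_workers = 4  # Number of workers for data loading; choose according to your CPU count
    num_clicked_news_a_user = 50  # Number of sampled click history for each user
    num_words_title = 20  # Number of words per title
    num_words_abstract = 50  # Number of words per abstract
    word_freq_threshold = 1  # Word frequency threshold: minimum frequency for a word to count as a word of interest
    entity_freq_threshold = 2  # Entity frequency threshold: minimum frequency for an entity to count as an entity of interest
    entity_confidence_threshold = 0.5  # Entity confidence threshold: minimum confidence for an entity to count as useful
    negative_sampling_ratio = 2  # K, the negative sampling ratio
    dropout_probability = 0.2  # Dropout probability, used to reduce overfitting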
    # Modify the following by the output of `src/dataprocess.py`
    num_words = 1 + 101220  # Number of words (vocabulary size)
    num_categories = 1 + 295  # Number of categories
    num_entities = 1 + 21842  # Number of entities
    num_users = 1 + 711222  # Number of users
    word_embedding_dim = 300  # Word embedding dimension
    category_embedding_dim = 100  # Category embedding dimension
    # Modify the following only if you use another dataset
    entity_embedding_dim = 100  # Entity embedding dimension
    # For additive attention
    query_vector_dim = 200  # Query vector dimension
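
As a minimal sketch of how settings like these might be consumed, the example below subclasses a trimmed-down `BaseConfig` and uses two of its fields: `num_words_title` to pad or truncate a tokenized title, and `negative_sampling_ratio` to pick K negatives for one positive click. The names `DKNConfig`, `pad_or_truncate`, and `sample_negatives`, the overridden `batch_size = 64`, and the use of id 0 for padding are all assumptions made for illustration; none of them is taken from the DKN source.

```python
import random


class BaseConfig:
    """Trimmed copy of three fields from the listing above."""
    batch_size = 128
    num_words_title = 20
    negative_sampling_ratio = 2  # K


class DKNConfig(BaseConfig):
    """Hypothetical model-specific config: inherits all fields, overrides one."""
    batch_size = 64  # assumed value, for illustration only


def pad_or_truncate(token_ids, length, pad_id=0):
    """Return exactly `length` ids: cut long inputs, right-pad short ones.

    Using 0 as the padding id is an assumption of this sketch.
    """
    return token_ids[:length] + [pad_id] * max(0, length - len(token_ids))


def sample_negatives(candidates, k, rng):
    """Pick k non-clicked news ids without replacement.

    Assumes len(candidates) >= k; otherwise random.Random.sample raises ValueError.
    """
    return rng.sample(candidates, k)


config = DKNConfig

# A three-word title becomes a fixed-length list of num_words_title ids.
title = pad_or_truncate([5, 17, 42], config.num_words_title)

# K negatives drawn for a single positive sample.
negatives = sample_negatives(
    ["N1", "N2", "N3", "N4"],
    config.negative_sampling_ratio,
    random.Random(0),
)
```

Keeping the settings as class attributes means a model-specific config only has to list the values it changes; everything else falls through to `BaseConfig` by ordinary attribute lookup.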