Abstract: Community discovery is a key problem in data mining on complex networks. Its results can also support downstream analysis of social network content, or be combined with power grid data to characterize electricity users and help block managers better serve their blocks. Existing community discovery algorithms are often based on the idea of seed expansion, which has three shortcomings. First, the seed nodes selected by the algorithm may be adjacent to each other, so the expanded communities overlap heavily and the division is indistinct. Second, the order in which unassigned nodes are added during expansion is ignored. Third, the influence among assignable nodes is insufficiently considered. To address these problems, this paper proposes a parallelized community discovery algorithm based on seed filtering and node influence. When selecting key seeds, the algorithm first applies a seed filtering mechanism that eliminates adjacent seed nodes and reduces the overlap of the community structure. During expansion, it then uses the similarity and distance between nodes and communities to quantify node influence, and adds nodes with high influence first. Finally, the algorithm is parallelized to handle community division in large-scale social networks.
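The two serial steps described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not give the formulas, so everything concrete here is an assumption made for illustration. Seeds are ranked by degree, similarity is taken to be the fraction of a node's neighbors already inside a community, distance is the hop count from the community's seed, and influence is their ratio. The function names `filter_seeds`, `hop_distances`, and `expand` are hypothetical, communities are kept non-overlapping for simplicity, and the parallelization step is omitted.

```python
from collections import deque


def filter_seeds(adj, k):
    """Pick up to k seeds by degree, skipping nodes adjacent to an accepted seed.

    Degree ranking is an assumption; the filtering of adjacent seeds is the
    idea taken from the abstract.
    """
    seeds = []
    for node in sorted(adj, key=lambda n: (-len(adj[n]), n)):
        if all(node not in adj[s] for s in seeds):
            seeds.append(node)
        if len(seeds) == k:
            break
    return seeds


def hop_distances(adj, source):
    """Breadth-first hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def expand(adj, seeds):
    """Grow one community per seed, adding the most influential candidate first."""
    communities = {s: {s} for s in seeds}
    dist = {s: hop_distances(adj, s) for s in seeds}
    assigned = set(seeds)
    while True:
        best = None  # (influence, node, seed)
        for s, members in communities.items():
            # Candidates: unassigned neighbors of the current community.
            frontier = {v for u in members for v in adj[u]} - assigned
            for v in sorted(frontier):
                # Assumed similarity: share of v's neighbors inside the community.
                similarity = len(adj[v] & members) / len(adj[v])
                # Assumed influence: similarity discounted by hops from the seed.
                influence = similarity / dist[s][v]
                if best is None or influence > best[0]:
                    best = (influence, v, s)
        if best is None:
            break
        _, node, seed = best
        communities[seed].add(node)
        assigned.add(node)
    return communities


# Two triangles joined by the bridge 2-3 (undirected, adjacency as sets).
graph = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
seeds = filter_seeds(graph, 2)
print(seeds)                 # [2, 4]
print(expand(graph, seeds))  # {2: {0, 1, 2}, 4: {3, 4, 5}}
```

In this toy graph the two highest-degree nodes, 2 and 3, are adjacent. Without filtering they would both become seeds on either side of the bridge; with filtering, node 3 is rejected and node 4 is chosen instead, and the influence ordering then assigns node 3 to the triangle it shares more neighbors with.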