Hate speech is a major problem on social media platforms. Automatic hate speech detection methods relying on machine learning models, which learn from manually labeled datasets, have been proposed in both academia and industry. However, there is increasing evidence that hate speech detection datasets labeled by general annotators (e.g., amateurs or MTurk workers) contain systematic bias, because such annotators cannot effectively account for differences in language use among speakers. When such biased datasets are used to train machine learning models, the resulting models will also be biased. Unlike general annotators, experts can produce much less biased annotations. However, expert annotations cannot be obtained efficiently in large quantities. This paper bridges the gap by adopting a weakly supervised learning method for hate speech detection using a small number of expert annotations. We propose a novel design that uses contrastive learning and prompt-based learning based on large language models, incorporating a group estimator, a pair generator, and knowledge injection. Using real-world Twitter posts written by African American English speakers and by members of other racial groups as an example, extensive experiments were conducted to demonstrate the superior performance of the proposed method. The proposed approach was also evaluated on data from the LGBTQ+ community and achieved consistent results. The study has important academic and practical implications for hate speech detection and large language models.

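"Rather than creating wealth for individuals, create change for the world" is the motto of Dr. Li Zhepeng (李哲鹏). Dr. Li has long been committed to advancing innovation and progress in information technology and machine learning through academic research.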
In social networks, social foci are physical or virtual entities around which social individuals organize joint activities, for example, places and products (physical form) or opinions and services (virtual form). Forecasting which social foci will diffuse to more social individuals is important for managerial functions such as marketing and public management operations. In terms of diffusive social adoptions, prior studies on user adoptive behavior in social networks have focused on single-item adoption in homogeneous networks. We advance this body of research by modeling scenarios with multi-item adoption and learning the relative propagation of social foci in concurrent social diffusions for online social networking platforms. In particular, we distinguish two types of social nodes in our two-mode social network model: social foci and social actors. Based on social network theories, we identify and operationalize factors that drive social adoption within the two-mode social network. We also capture the interdependencies between social actors and social foci using a bilateral recursive process—specifically, a mutual reinforcement process that converges to an analytical form. Thus, we develop a gradient learning method based on a mutual reinforcement process that targets the optimal parameter configuration for pairwise ranking of social diffusions. Further, we demonstrate analytical properties of the proposed method such as guaranteed convergence and the convergence rate. In the evaluation, we benchmark the proposed method against prevalent methods, and we demonstrate its superior performance using three real-world data sets that cover the adoption of both physical and virtual entities in online social networking platforms.
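To make the idea of a mutual reinforcement process on a two-mode network concrete, the following is a minimal Python sketch. It is not the method from the paper, whose process incorporates theory-driven factors and learned parameters that the abstract does not specify. Instead it assumes a plain HITS-style update: a focus scores highly when adopted by high-scoring actors, and an actor scores highly when it adopts high-scoring foci. The function name `mutual_reinforcement`, the helper `_normalize`, and the toy adoption list are all hypothetical.

```python
from math import sqrt


def _normalize(scores):
    """Scale a score vector to unit Euclidean length (no-op for all zeros)."""
    norm = sqrt(sum(s * s for s in scores))
    return [s / norm for s in scores] if norm > 0 else list(scores)


def mutual_reinforcement(adoptions, n_actors, n_foci, iters=100, tol=1e-10):
    """HITS-style scoring on a two-mode (actor, focus) network.

    adoptions: list of (actor_index, focus_index) pairs (assumed input format).
    Returns (actor_scores, focus_scores), each normalized to unit length.
    """
    actor = [1.0] * n_actors
    focus = [1.0] * n_foci
    for _ in range(iters):
        # A focus gains score from every actor that adopted it.
        new_focus = [0.0] * n_foci
        for a, f in adoptions:
            new_focus[f] += actor[a]
        # An actor gains score from every focus it adopted.
        new_actor = [0.0] * n_actors
        for a, f in adoptions:
            new_actor[a] += new_focus[f]
        new_focus = _normalize(new_focus)
        new_actor = _normalize(new_actor)
        # Stop once neither score vector changes noticeably.
        change = max(
            abs(x - y)
            for x, y in zip(new_focus + new_actor, focus + actor)
        )
        actor, focus = new_actor, new_focus
        if change < tol:
            break
    return actor, focus


if __name__ == "__main__":
    # Toy data: three actors, two foci; focus 0 is adopted by everyone.
    toy = [(0, 0), (1, 0), (2, 0), (2, 1)]
    actor_scores, focus_scores = mutual_reinforcement(toy, n_actors=3, n_foci=2)
    print("actor scores:", actor_scores)
    print("focus scores:", focus_scores)
```

In this simplified form the iteration is a power method, so the focus scores approach the dominant eigenvector of the focus co-adoption matrix. That gives one illustration of how a recursive process can settle to a fixed analytical form, under the assumptions stated above.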




