Generative AI and Data Quality: Implications for Productivity, Labor Displacement, and Policy
Prof. ZhiFeng Cai
Assistant Professor
Department of Economics
Rutgers University
Generative AI relies on high-quality, human-created data, but as AI adoption grows, a larger share of new data is generated by AI itself, which affects overall data quality. This paper models generative AI with endogenous AI and labor adoption, where both AI and labor generate data of varying quality. An AI-data feedback loop emerges: data quality influences AI productivity and adoption, which in turn shape the composition (AI- vs. human-generated) and quality of future data. Drawing on experimental findings in AI research, the model predicts hump-shaped labor market dynamics: significant short-term labor displacement that partially reverses over time as low-quality AI-generated data degrades datasets. Because of data externalities, taxing AI adoption or limiting AI’s data access may enhance welfare.
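
The feedback loop described above can be illustrated with a deliberately stylized simulation. This is a toy sketch, not the paper's model: the functional forms (target adoption equal to the square of data quality, partial adjustment of adoption, a data stock that turns over at a fixed rate) and every parameter value are assumptions chosen only to show how slow-moving data quality can produce overshooting in AI adoption.

```python
def simulate(periods=200, q_human=1.0, q_ai=0.3, turnover=0.1, adjust=0.5):
    """Toy AI-data feedback loop (all forms and parameters are assumptions).

    adoption: share of tasks done by AI, i.e. the share of labor displaced.
    quality:  average quality of the data stock that AI learns from.
    Returns a list of (adoption, quality) pairs, one per period.
    """
    quality = q_human   # the initial data stock is entirely human-made
    adoption = 0.0      # no AI use at the start
    path = [(adoption, quality)]
    for _ in range(periods):
        # Assumed: desired adoption rises with data quality (AI productivity).
        target = quality ** 2
        # New data mixes AI and human output according to current adoption.
        new_data_quality = adoption * q_ai + (1 - adoption) * q_human
        # Adoption moves only part of the way toward its target each period.
        next_adoption = adoption + adjust * (target - adoption)
        # A fraction `turnover` of the data stock is replaced by new data.
        quality = (1 - turnover) * quality + turnover * new_data_quality
        adoption = next_adoption
        path.append((adoption, quality))
    return path


if __name__ == "__main__":
    path = simulate()
    adoption = [a for a, _ in path]
    peak = max(adoption)
    print(f"peak adoption {peak:.3f} in period {adoption.index(peak)}")
    print(f"long-run adoption {adoption[-1]:.3f}, data quality {path[-1][1]:.3f}")
```

Under these assumed parameters, adoption first climbs quickly while the data stock is still mostly human-made, then falls back toward a lower long-run level as AI-generated data lowers average quality, which mirrors the hump-shaped displacement pattern in qualitative terms only.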













