OpenAI launches open-source toolkit for social science research
GABRIEL uses GPT to turn unstructured text and images into quantitative data for academic study
OpenAI has released GABRIEL, an open-source toolkit designed to help social scientists convert large volumes of unstructured text and images into measurable, quantitative data.
The Python library allows researchers to describe what they want to measure in plain language and then applies that prompt consistently across thousands or millions of documents, returning a numerical score for each one.
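As a rough illustration of that workflow, the sketch below applies one plain-language instruction to every document and records a number for each. It does not use GABRIEL's actual API, which the announcement does not describe. The names `toy_scorer` and `score_documents` are hypothetical, and the keyword-overlap scorer is only a stand-in for the language-model call a real pipeline would make.

```python
from typing import Callable, Dict, List


def toy_scorer(instruction: str, document: str) -> float:
    """Hypothetical stand-in for a model call.

    Returns a 0-100 score: the share of instruction keywords (words longer
    than three characters) that also appear in the document.
    """
    keywords = {w.lower().strip(".,") for w in instruction.split() if len(w) > 3}
    if not keywords:
        return 0.0
    words = {w.lower().strip(".,") for w in document.split()}
    return 100.0 * len(keywords & words) / len(keywords)


def score_documents(
    documents: List[str],
    instruction: str,
    scorer: Callable[[str, str], float],
) -> List[Dict[str, object]]:
    """Apply the same instruction to every document, one score per document."""
    return [{"document": d, "score": scorer(instruction, d)} for d in documents]


if __name__ == "__main__":
    reviews = [
        "The product quality deserves praise.",
        "Delivery was late.",
    ]
    for row in score_documents(reviews, "praise product quality", toy_scorer):
        print(row["score"], row["document"])
```

Passing the scorer in as a function keeps the loop identical whether the score comes from this toy heuristic or from a model, which is the consistency the toolkit is described as providing.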
OpenAI said it benchmarked GPT's accuracy at labelling qualitative data across a range of use cases and found that it performed well. The examples it cited include tracking methodologies in scientific papers, measuring the attention given to topics in course curricula, extracting structured historical records for small European towns, and analysing customer reviews at scale.
Beyond measurement, GABRIEL includes tools for merging mismatched datasets, deduplication, passage coding, generating new research hypotheses and stripping personal information to protect privacy.
The toolkit is aimed at economists, social scientists and data scientists, and is designed to require minimal technical expertise, with a tutorial notebook included.
OpenAI, the artificial intelligence company behind ChatGPT, said it would continue developing GABRIEL based on feedback from the academic community.
The Recap
- OpenAI released GABRIEL, an open-source toolkit for qualitative measurement.
- The toolkit applies researchers’ questions across thousands or millions of documents.
- The company will refine GABRIEL based on academic community feedback.