GenX Toolkit Part 1: ACL Anthology Data

Extracting and processing the ACL Anthology dataset, then generating new text for evaluation with the GenX Toolkit.

Extracting and Processing Abstracts from ACL Anthology Data

The red box marks the BibTeX file to download.
Code for extracting entries from the BibTeX file.
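A minimal sketch of this extraction step, using only the Python standard library. It assumes entries have the form `@type{key, field = value, ...}`, with values delimited by double quotes or braces, or left bare (as in `month = jul`). The function names `parse_bibtex`, `parse_fields` and `read_value` are hypothetical helpers introduced for illustration; they are not part of the GenX Toolkit.

```python
import re

# "@inproceedings{some-key," marks the start of an entry.
ENTRY_START = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,")
# "fieldname =" marks the start of a field inside an entry body.
FIELD_START = re.compile(r"\s*([A-Za-z][\w-]*)\s*=\s*")


def read_value(body, pos):
    """Read one field value starting at body[pos]; return (value, next_pos)."""
    if pos >= len(body):
        return "", pos
    opener = body[pos]
    if opener in '{"':
        depth = 0
        i = pos + 1
        while i < len(body):
            ch = body[i]
            if ch == "{":
                depth += 1
            elif ch == "}":
                if depth == 0 and opener == "{":
                    break  # closing brace of a brace-delimited value
                depth = max(depth - 1, 0)
            elif ch == '"' and opener == '"' and depth == 0:
                break  # closing quote of a quote-delimited value
            i += 1
        return body[pos + 1:i], i + 1
    # Bare value (a number or a month macro): read up to the next comma.
    end = body.find(",", pos)
    if end == -1:
        end = len(body)
    return body[pos:end].strip(), end


def parse_fields(body):
    """Turn the text between 'key,' and the closing brace into a dict."""
    fields = {}
    pos = 0
    while True:
        match = FIELD_START.match(body, pos)
        if not match:
            break
        value, pos = read_value(body, match.end())
        # Collapse line breaks and runs of spaces inside the value.
        fields[match.group(1).lower()] = " ".join(value.split())
        # Skip the comma and whitespace that separate fields.
        while pos < len(body) and body[pos] in ", \t\r\n":
            pos += 1
    return fields


def parse_bibtex(text):
    """Return one dict per entry, with 'type', 'key' and the entry's fields."""
    entries = []
    pos = 0
    while True:
        match = ENTRY_START.search(text, pos)
        if not match:
            break
        # Walk forward to the brace that closes this entry.
        depth, i = 1, match.end()
        while i < len(text) and depth > 0:
            if text[i] == "{":
                depth += 1
            elif text[i] == "}":
                depth -= 1
            i += 1
        end = i - 1 if depth == 0 else i
        entry = {"type": match.group(1).lower(), "key": match.group(2)}
        entry.update(parse_fields(text[match.end():end]))
        entries.append(entry)
        pos = i
    return entries
```

Calling `parse_bibtex(open(path, encoding="utf-8").read())` on the downloaded file, where `path` is wherever it was saved, would give a list of dictionaries whose `abstract` field is present only for entries that include one. A dedicated BibTeX library would cope better with string macros and malformed input; the hand-rolled scanner here only keeps the example free of dependencies.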
Example of an ACL Anthology entry extracted from the downloaded BibTeX file.
Example of the processed ACL abstract.
Code to process, extract, split and then save the ACL abstracts.
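A minimal sketch of the process, extract, split and save steps. It assumes each entry is a dictionary that may carry an `abstract` string, that "processing" means stripping BibTeX braces and normalising whitespace, and that the split is a shuffled train/test split with one abstract per line in each output file. The function names, the 10% test fraction and the file names `train.txt` and `test.txt` are illustrative choices, not values taken from the GenX Toolkit.

```python
import random
import re
from pathlib import Path


def clean_abstract(raw):
    """Drop BibTeX case-protection braces and collapse whitespace."""
    text = re.sub(r"[{}]", "", raw)
    return " ".join(text.split())


def split_abstracts(abstracts, test_fraction=0.1, seed=0):
    """Shuffle reproducibly and return (train, test) lists."""
    shuffled = list(abstracts)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction)) if shuffled else 0
    return shuffled[n_test:], shuffled[:n_test]


def save_abstracts(abstracts, path):
    """Write one abstract per line."""
    content = "".join(line + "\n" for line in abstracts)
    Path(path).write_text(content, encoding="utf-8")


def build_dataset(entries, out_dir, test_fraction=0.1, seed=0):
    """Extract, clean, split and save the abstracts found in `entries`."""
    abstracts = [
        clean_abstract(entry["abstract"])
        for entry in entries
        if entry.get("abstract")  # skip entries without an abstract
    ]
    train, test = split_abstracts(abstracts, test_fraction, seed)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    save_abstracts(train, out / "train.txt")
    save_abstracts(test, out / "test.txt")
    return train, test
```

Fixing the seed keeps the split stable between runs, so generated text can later be evaluated against the same held-out abstracts.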

Training and Generation

This code trains a GPT-2 model on the ACL abstracts data and then generates new abstracts.

Data Scientist working in the field of NLP, NLG and NLU