GLoRE

A unified logical reasoning benchmark and evaluation pipeline across datasets such as LogiQA, ReClor, FOLIO, AR-LSAT, ProofWriter, and RuleTaker.


GLoRE is a logical reasoning project focused on evaluating model performance consistently across a diverse set of benchmarks. I contributed to the benchmark integration and the evaluation pipeline, which together cover LogiQA, ReClor, FOLIO, AR-LSAT, ProofWriter, and RuleTaker.

The project's main value lies in unifying comparisons across reasoning settings that are usually studied in isolation: normalizing the datasets into one evaluation pipeline makes it easier to analyze a model's strengths, weaknesses, and transfer patterns in logical reasoning.
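To make the idea of a unified pipeline concrete, here is a minimal sketch of how heterogeneous benchmarks can be normalized into one schema and scored with a single evaluation loop. All names, fields, and adapters below are illustrative assumptions for this page, not GLoRE's actual API.

```python
# Hypothetical sketch of a unified evaluation pipeline.
# The schema and adapter functions are illustrative assumptions,
# not GLoRE's real interfaces.
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Example:
    # Common schema that every dataset is normalized into.
    context: str
    question: str
    options: List[str]
    answer: int  # index of the gold option


def adapt_multiple_choice(raw: dict) -> Example:
    # Assumed ReClor/LogiQA-style record with a "label" gold index.
    return Example(raw["context"], raw["question"], raw["answers"], raw["label"])


def adapt_entailment(raw: dict) -> Example:
    # Assumed ProofWriter/RuleTaker-style true/false entailment record,
    # mapped onto the same multiple-choice schema.
    return Example(raw["theory"], raw["question"], ["True", "False"],
                   0 if raw["answer"] else 1)


def evaluate(model: Callable[[Example], int],
             examples: Iterable[Example]) -> float:
    # Accuracy computed under the shared schema, so scores are
    # directly comparable across datasets.
    examples = list(examples)
    correct = sum(model(ex) == ex.answer for ex in examples)
    return correct / len(examples)
```

A trivial baseline that always picks the first option can then be scored on any adapted dataset with `evaluate(lambda ex: 0, examples)`, which is the kind of apples-to-apples comparison the unified pipeline is meant to enable.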
