This code package is the official implementation of the research paper:
Title: "Artificial Intelligence-assisted Literature Screening with Empirical Validation in Reinforced Autoclaved Aerated Concrete Research"
Status: In Preparation
Authors: Liu, L. et al.
Year: 2025
Important Notice: This code package is currently under embargo and will be made publicly accessible after the associated research paper is published. The repository is maintained for reference purposes during the peer review process.
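This repository contains the full Python codebase supporting our research on AI-assisted literature screening and its empirical validation in reinforced autoclaved aerated concrete (RAAC) research. The code implements an AI-powered workflow that extracts RAAC definitions or contextual summaries, checks documents for explicit RAAC mentions, and extracts and aggregates defect information.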
The scripts were developed using the Claude 3 Opus API and tested on a dataset of UK-based RAAC research documents. They provide automated literature screening for researchers.
Note: The four scripts numbered 01-a to 02-b (Definition Extraction and Explicit Mention Check) are the core validated components supporting the main research paper. Scripts 03 and 04 (Defect Analysis) are experimental: they were developed as part of this research but have not yet been empirically validated, and are provided for the research community to build upon.
| File | Description |
|---|---|
| README.md | Project documentation and setup instructions |
| Script | Description |
|---|---|
| 01-a_RAAC_DefinitionExtraction.py | Extracts RAAC definitions or generates contextual summaries |
| 01-b_RAAC_DefinitionPresenceCheck_BinaryFlag.py | Binary classification for RAAC definition presence |
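
As an illustration of what a binary flag step involves, the sketch below maps a free-text yes/no reply to 1 or 0. It assumes the model is prompted to begin its answer with "Yes" or "No"; the function name `to_binary_flag` and that reply format are hypothetical and may not match how the script itself works.

```python
def to_binary_flag(reply: str) -> int:
    """Map a free-text yes/no reply to 1 (present) or 0 (absent).

    Assumption: the reply starts with "Yes" or "No". Anything else raises
    an error so that ambiguous replies are reviewed rather than guessed.
    """
    words = reply.strip().lower().split()
    # Drop punctuation attached to the first word, e.g. "Yes," or "No."
    first = words[0].strip(".,:;!\"'") if words else ""
    if first == "yes":
        return 1
    if first == "no":
        return 0
    raise ValueError(f"Cannot interpret reply as yes/no: {reply!r}")
```

Raising on ambiguous replies, rather than defaulting to 0, keeps unclear cases visible for manual checking.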
| Script | Description |
|---|---|
| 02-a_RAAC_ExplicitMentionCheck.py | Determines explicit RAAC mentions in documents |
| 02-b_RAAC_ExplicitMentionCheck_BinaryFlag.py | Binary classification for RAAC mentions |
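
For a quick sanity check on the explicit-mention flags, a plain keyword baseline can be compared against the model's output. The sketch below is not the LLM-based method used by the scripts; the pattern and the function name `mentions_raac` are assumptions for illustration only.

```python
import re

# Matches the acronym or the spelled-out term, ignoring case and allowing
# line breaks or repeated spaces between the words.
RAAC_PATTERN = re.compile(
    r"\bRAAC\b|\breinforced\s+autoclaved\s+aerated\s+concrete\b",
    re.IGNORECASE,
)


def mentions_raac(text: str) -> bool:
    """Return True if the text contains an explicit RAAC keyword match."""
    return RAAC_PATTERN.search(text) is not None
```

Documents where this baseline and the model disagree are natural candidates for manual review.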
| Script | Description |
|---|---|
| 03_DefectExtraction_SevenQuestions.py | Extracts defect information using 7-question framework |
| 04_CombineExtractedDefectData.py | Aggregates defect data into master dataset |
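
The aggregation step can be pictured as concatenating per-document result files into one table. The following is a minimal sketch that assumes the defect extraction step writes one CSV file per document into a single folder; that layout, the `source_file` column, and the function name `combine_csv_files` are hypothetical and may differ from what `04_CombineExtractedDefectData.py` actually does.

```python
import csv
from pathlib import Path


def combine_csv_files(input_dir, output_path):
    """Concatenate every CSV in input_dir into one file.

    Adds a source_file column and returns the number of data rows written.
    Write the output outside input_dir so a re-run does not pick it up.
    """
    rows = []
    fieldnames = ["source_file"]
    for csv_path in sorted(Path(input_dir).glob("*.csv")):
        with csv_path.open(newline="", encoding="utf-8") as handle:
            reader = csv.DictReader(handle)
            # Collect the union of column names, keeping first-seen order.
            for name in reader.fieldnames or []:
                if name not in fieldnames:
                    fieldnames.append(name)
            for row in reader:
                rows.append({"source_file": csv_path.name, **row})
    with Path(output_path).open("w", newline="", encoding="utf-8") as handle:
        # Columns missing from a given file are written as empty strings.
        writer = csv.DictWriter(
            handle, fieldnames=fieldnames, restval="", extrasaction="ignore"
        )
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Keeping the source file name on each row makes it possible to trace any record in the master dataset back to the document it came from.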
Clone the repository:
git clone https://github.com/lixuliu/raac-llm-analysis-code-package.git
cd raac-llm-analysis-code-package
Create a virtual environment (recommended):
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
Install dependencies:
pip install -r requirements.txt
Set up your API key:
export ANTHROPIC_API_KEY='your-api-key-here' # On Windows: set ANTHROPIC_API_KEY=your-api-key-here
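
Before starting a long run, it can help to confirm that the key is visible to Python. The helper below is a small sketch, not part of the repository; the function name `require_api_key` is hypothetical, and only the `ANTHROPIC_API_KEY` variable name comes from the instructions above.

```python
import os


def require_api_key(env=None):
    """Return the API key from the environment, or raise if it is missing."""
    env = os.environ if env is None else env
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ANTHROPIC_API_KEY is not set; export it before running the scripts."
        )
    return key
```

Calling `require_api_key()` at the top of a session fails fast with a clear message instead of partway through a batch of documents.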
Set up your environment:
# Set your Anthropic API key
export ANTHROPIC_API_KEY='your-api-key-here' # On Windows: set ANTHROPIC_API_KEY=your-api-key-here
# Install required dependencies
pip install -r requirements.txt
Prepare your data:
Run the analysis pipeline:
# Step 1a: Extract RAAC definitions
python "01-a_RAAC_DefinitionExtraction.py"
# Step 1b: Check for RAAC definition presence
python "01-b_RAAC_DefinitionPresenceCheck_BinaryFlag.py"
# Step 2a: Check for explicit RAAC mentions
python "02-a_RAAC_ExplicitMentionCheck.py"
# Step 2b: Check for RAAC mention presence
python "02-b_RAAC_ExplicitMentionCheck_BinaryFlag.py"
# Step 3: Extract defect information
python 03_DefectExtraction_SevenQuestions.py
# Step 4: Combine the results
python 04_CombineExtractedDefectData.py
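
The steps above can also be run in sequence from a single Python file. This is a hedged sketch rather than a script shipped with the repository: it assumes the six scripts sit in the current working directory and take no command-line arguments, as in the commands listed above. The names `PIPELINE`, `build_commands`, and `run_pipeline` are hypothetical.

```python
import subprocess
import sys

# Script names in the order given by the steps above.
PIPELINE = [
    "01-a_RAAC_DefinitionExtraction.py",
    "01-b_RAAC_DefinitionPresenceCheck_BinaryFlag.py",
    "02-a_RAAC_ExplicitMentionCheck.py",
    "02-b_RAAC_ExplicitMentionCheck_BinaryFlag.py",
    "03_DefectExtraction_SevenQuestions.py",
    "04_CombineExtractedDefectData.py",
]


def build_commands(python_exe=sys.executable):
    """Return one command (as an argument list) per pipeline script."""
    return [[python_exe, script] for script in PIPELINE]


def run_pipeline():
    """Run each script in order and stop at the first failure."""
    for command in build_commands():
        print("Running:", " ".join(command))
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(command, check=True)
```

Calling `run_pipeline()` executes the steps one after another. Stopping at the first failure is a conservative choice made here, since whether later steps depend on earlier outputs is not stated above.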
This research was supported by:
This project is licensed under the
Creative Commons Attribution-NonCommercial 4.0 International
License (CC BY-NC 4.0).
You are free to share and adapt the material for non-commercial
purposes, provided you give appropriate credit.
Commercial use is not permitted without explicit permission from
the author.