# Large language models (LLMs) and retrieval augmented generation (RAG) for physics-specific queries using the OpenAI API

2025-01-15 - James Verbus

https://www.linkedin.com/in/jamesverbus/

# Prerequisites

1) You need to set up an OpenAI account. https://platform.openai.com/


2) Create an OpenAI API key. https://platform.openai.com/settings/organization/api-keys

3) Add your API key and organization ID as Colab secrets in your notebook, named "OPENAI_API_KEY" and "OPENAI_ORGANIZATION"

4) The first half of the notebook can be completed using a free account. If you want to complete the second half, you need to fund your OpenAI account for API calls ($1 should be enough).


# Setup environment

In [None]:
from IPython.display import HTML, display

def set_css():
 display(HTML('''
 
 '''))
get_ipython().events.register('pre_run_cell', set_css)

## Install packages

In [None]:
!pip install llama-index==0.12.3 openai==1.54.5 pypdf==5.1.0 httpx==0.27.2



## Imports

In [None]:
import openai, os

from google.colab import drive, userdata
from llama_index.core import ServiceContext, Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.node_parser import SimpleNodeParser
from llama_index.llms.openai import OpenAI

## Set up OpenAI keys

The cell below reads your OpenAI API key and organization ID from the notebook secrets "OPENAI_API_KEY" and "OPENAI_ORGANIZATION" (see Prerequisites). Funding the account is only required for the RAG half of the notebook.

---



In [None]:
openai.organization = userdata.get("OPENAI_ORGANIZATION")
openai.api_key = userdata.get("OPENAI_API_KEY")

# Define queries

In [None]:
questions = {
 1: "In the LUX D-D analysis, what neutron source rate provided optimal match between the absolute number of single-scatter events in simulation and data",
 2: "In the first results from the LUX detector, what was the average electric field used when measuring the charge and light yields?",
 3: "What is the mean energy of neutrons produced by a DD108 fusion source?",
 4: "What is the energy at the endpoint of the D-D neutron recoil energy spectrum in liquid xenon? What was the size of the S1 and S2 signals observed in LUX at this endpoint",
 5: "How low in energy was the ER response measured using 127Xe? Where did the 127Xe come from?",
 # Add your questions 6, 7, ... here
}

# Query OpenAI gpt-4o-mini directly (without RAG)



Available OpenAI models and limits are listed here: https://platform.openai.com/settings/organization/limits

In [None]:
client = openai.OpenAI(
    api_key=userdata.get("OPENAI_API_KEY"),
)

def query_oai(question):
    chat_completion = client.chat.completions.create(
        messages=[
            {
                "role": "user",
                "content": question
            }
        ],
        model="gpt-4o-mini",
    )

    return chat_completion
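
The per-question cells that follow repeat the same steps for each entry in `questions`. As an optional alternative, here is a minimal sketch of a loop helper. The names `ask_all`, `demo_questions`, and `demo_answers` are hypothetical and not part of the original notebook, and `ask` is assumed to be any function that takes a question string and returns the answer text.

```python
def ask_all(questions, ask):
    """Run `ask` on every question and return {question number: answer text}."""
    answers = {}
    for number in sorted(questions):
        answers[number] = ask(questions[number])
    return answers

# Stand-in "model" so the sketch runs without an API key: it just upper-cases the question.
demo_questions = {2: "second?", 1: "first?"}
demo_answers = ask_all(demo_questions, lambda q: q.upper())

for number, answer in demo_answers.items():
    print(f"Question {number}: {answer}")

# In the notebook, the same helper could wrap the direct query, e.g.
# ask_all(questions, lambda q: query_oai(q).choices[0].message.content)
```

Passing the query function as an argument keeps the helper independent of which backend answers the question, so the same loop could be reused for the RAG queries later in the notebook.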

## Question 1

In [None]:
question = questions[1]
response = query_oai(question)

print("Question: " + question)
print("")
print("Response: " + response.choices[0].message.content)

Question: In the LUX D-D analysis, what neutron source rate provided optimal match between the absolute number of single-scatter events in simulation and data

Response: In the LUX (Large Underground Xenon) dark matter experiment, the optimal neutron source rate that provided an effective match between the absolute number of single-scatter events in the simulated data and the actual observational data was identified as **10 neutrons per day**. This value was crucial for aligning the simulation results with the measured data, thereby enhancing the overall accuracy and reliability of the experiment’s findings regarding dark matter interactions.


## Question 2

In [None]:
question = questions[2]
response = query_oai(question)

print("Question: " + question)
print("")
print("Response: " + response.choices[0].message.content)

Question: In the first results from the LUX detector, what was the average electric field used when measuring the charge and light yields?

Response: In the first results from the LUX (Large Underground Xenon) detector, the average electric field used when measuring the charge and light yields was approximately 0.25 kV/cm. This value was employed in the experiments to optimize the detection of light and charge produced by potential dark matter interactions within the detector's xenon target.


## Question 3

In [None]:
question = questions[3]
response = query_oai(question)

print("Question: " + question)
print("")
print("Response: " + response.choices[0].message.content)

Question: What is the mean energy of neutrons produced by a DD108 fusion source?

Response: DD108 refers to a specific neutron source that utilizes deuterium-deuterium (D-D) fusion reactions. In general, D-D fusion can produce different types of reactions, but the primary reactions relevant to neutron production are:

1. \( D + D \rightarrow T + p \) (produces a proton)
2. \( D + D \rightarrow He + n \) (produces a neutron)

The mean energy of the neutrons produced by these fusion reactions typically ranges in the order of several MeV (Mega electron Volts). In the case of the D-D reaction producing a neutron, the neutron energy is commonly around \( 2.5 \) MeV.

For the DD108 fusion source, it's important to look at specific manufacturer data or research publications for precise figures, but generally, the output neutron energy is often around 2.5 to 2.45 MeV.


## Question 4

In [None]:
question = questions[4]
response = query_oai(question)

print("Question: " + question)
print("")
print("Response: " + response.choices[0].message.content)

Question: What is the energy at the endpoint of the D-D neutron recoil energy spectrum in liquid xenon? What was the size of the S1 and S2 signals observed in LUX at this endpoint

Response: In the context of dark matter searches and the detection of nuclear recoils, energy spectra can provide important information about various processes. For deuterium-deuterium (D-D) fusion reactions, neutron production can result in nuclear recoils, and in liquid xenon, the energy of these recoils can be directly correlated to the light and charge signals detected in experiments like LUX (Large Underground Xenon).

The recoil energy spectrum for neutron recoils in liquid xenon typically has a maximum energy, which corresponds to the endpoint of the spectrum. This maximum recoil energy is roughly in the range of a few MeV per neutron, often around 2-3 MeV depending on the kinematics of the particular reaction.

For the LUX experiment, the S1 and S2 signals directly correspond to the scintillation lig

## Question 5

In [None]:
question = questions[5]
response = query_oai(question)

print("Question: " + question)
print("")
print("Response: " + response.choices[0].message.content)

Question: How low in energy was the ER response measured using 127Xe? Where did the 127Xe come from?

Response: The energy levels for the ER (excitation-resolved) response measured using \(^{127}\text{Xe}\) typically refer to low-energy nuclear reactions or resonance states relevant to experimental nuclear physics studies. The specific low-energy threshold of the measured ER response would depend on the details of the experiment, including the method used and the reactions being investigated. Typically, these measurements might focus on energies below a few MeV, but for precise values, you would need to refer to specific experimental papers or results that detail the findings.

As for the origin of \(^{127}\text{Xe}\), this isotopes is stable and can be found naturally in trace amounts in the Earth's atmosphere and in some minerals. \(^{127}\text{Xe}\) can also be produced artificially in nuclear reactors or accelerators, where xenon isotopes are created through neutron capture process

# Set up a basic RAG system with Brown Particle Astrophysics Group theses in Google Drive

NOTE: To proceed past this point, your OpenAI account must be funded. The API calls below require a Tier 1 account.

## Fetch theses and add them to Google Drive

Brown Particle Astrophysics Group Theses

Copied from: https://particleastro.brown.edu/graduate-theses/

In [None]:
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
top_level_path = '/content/drive/Shared drives/AI Winter School (Brown Physics CFPU 2025)/'
module_path = 'Module 5 -- Large Language Models (LLMs) and Retrieval Augmented Generation (RAG)'

In [None]:
os.chdir(os.path.join(top_level_path, module_path, 'documents', 'lux_dd_papers'))

In [None]:
!ls

1608.05381v2.pdf 1-s2.0-S0168900217301158-am.pdf


## Set up LlamaIndex

In [None]:
# Path to the LUX D-D papers, followed by the basic LlamaIndex parameters
llama_index_data_path = os.path.join(top_level_path, module_path, 'documents', 'lux_dd_papers')
Settings.chunk_size = 1000
Settings.chunk_overlap = 100
Settings.llm = OpenAI(model="gpt-4o-mini")

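This next step takes a few minutes: LlamaIndex reads the PDFs, splits the text into chunks, requests an embedding vector for each chunk, and stores the vectors in an in-memory index. Let's review what is happening while the code runs.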

In [None]:
# Read the documents from the mounted Google Drive with LlamaIndex
documents = SimpleDirectoryReader(llama_index_data_path, recursive=True).load_data()

# Create the vector index from the documents
index = VectorStoreIndex.from_documents(documents)
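
To make the `chunk_size` and `chunk_overlap` settings above more concrete, here is a minimal sketch of sliding-window chunking over a list of tokens. It is a simplification and not LlamaIndex's actual splitter, which also uses a real tokenizer and tries to respect sentence boundaries. The function `chunk_tokens` and the toy values are hypothetical.

```python
def chunk_tokens(tokens, chunk_size, chunk_overlap):
    """Split `tokens` into windows of at most `chunk_size` items,
    where consecutive windows share `chunk_overlap` items."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must satisfy 0 <= chunk_overlap < chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the text
    return chunks

# Toy example: 10 "tokens", windows of 4 with an overlap of 1.
toy_tokens = list(range(10))
toy_windows = chunk_tokens(toy_tokens, chunk_size=4, chunk_overlap=1)
print(toy_windows)  # [[0, 1, 2, 3], [3, 4, 5, 6], [6, 7, 8, 9]]
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of embedding some text twice.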

# Query OpenAI gpt-4o-mini with RAG

In [None]:
def query_rag(question):
    query_engine = index.as_query_engine(similarity_top_k=5)
    response = query_engine.query(question)

    return response
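
The `similarity_top_k=5` argument asks the retriever for the five chunks whose embeddings are most similar to the embedding of the question. The following toy sketch illustrates that idea, assuming cosine similarity as the measure. The three-dimensional vectors, the chunk names, and the helper functions are all hypothetical; real embeddings have far more dimensions and come from an embedding model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, chunk_vecs, k):
    """Return the names of the k chunks most similar to the query, best first."""
    scored = [(cosine_similarity(query_vec, vec), name)
              for name, vec in chunk_vecs.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]

# Hand-made "embeddings" for four chunks and one query.
toy_embeddings = {
    "chunk_a": [1.0, 0.0, 0.0],
    "chunk_b": [0.9, 0.1, 0.0],
    "chunk_c": [0.0, 1.0, 0.0],
    "chunk_d": [0.0, 0.0, 1.0],
}
toy_query = [1.0, 0.05, 0.0]

nearest = top_k(toy_query, toy_embeddings, k=2)
print(nearest)  # ['chunk_a', 'chunk_b']
```

Only the retrieved chunks are passed to the LLM as context, so a larger `k` gives the model more material to draw on but also increases the prompt size and cost.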

## Question 1

In [None]:
question = questions[1]
response = query_rag(question)

In [None]:
print("Question: " + question)
print("")
print("Response: " + response.response)

Question: In the LUX D-D analysis, what neutron source rate provided optimal match between the absolute number of single-scatter events in simulation and data

Response: The neutron source rate of 2.6×10^6 n/s was used for data normalization in the LUX D-D analysis, which is in agreement with the independently measured source rate of (2.5 ± 0.3) × 10^6 n/s. This agreement confirms the consistency between the data and simulation in both absolute rate and shape.


## Question 2

In [None]:
question = questions[2]

response = query_rag(question)

print("Question: " + question)
print("")
print("Response: " + response.response)

Question: In the first results from the LUX detector, what was the average electric field used when measuring the charge and light yields?

Response: The average electric field used when measuring the charge and light yields in the first results from the LUX detector was 180 V/cm.


## Question 3

In [None]:
question = questions[3]

response = query_rag(question)

print("Question: " + question)
print("")
print("Response: " + response.response)

Question: What is the mean energy of neutrons produced by a DD108 fusion source?

Response: The mean energy of neutrons produced by a DD108 fusion source is approximately 2.45 MeV.


## Question 4

In [None]:
question = questions[4]

response = query_rag(question)

print("Question: " + question)
print("")
print("Response: " + response.response)

Question: What is the energy at the endpoint of the D-D neutron recoil energy spectrum in liquid xenon? What was the size of the S1 and S2 signals observed in LUX at this endpoint

Response: The energy at the endpoint of the D-D neutron recoil energy spectrum in liquid xenon is 74 keV nr. At this endpoint, the mean S1 signal observed was 2500 phd, and a raw S2 analysis threshold of 164 phd was applied.


## Question 5

In [None]:
question = questions[5]

response = query_rag(question)

print("Question: " + question)
print("")
print("Response: " + response.response)

Question: How low in energy was the ER response measured using 127Xe? Where did the 127Xe come from?

Response: The electron recoil (ER) response was measured using 131mXe, which is a metastable state of xenon resulting from cosmogenic activation. The measurements were conducted at energies close to 122 keVee. The specific low-energy threshold for the ER response using 127Xe is not detailed in the provided information.


In [None]:
response.metadata

{'9e523246-bf4e-42b2-9f58-9b09af5ec736': {'page_label': '7',
 'file_name': '1-s2.0-S0168900217301158-am.pdf',
 'file_path': '/content/drive/Shared drives/AI Winter School (Brown Physics CFPU 2025)/Module 5 -- Large Language Models (LLMs) and Retrieval Augmented Generation (RAG)/documents/lux_dd_papers/1-s2.0-S0168900217301158-am.pdf',
 'file_type': 'application/pdf',
 'file_size': 11051497,
 'creation_date': '2025-01-13',
 'last_modified_date': '2025-01-13'},
 'f9ce9669-2ac8-4f15-92b5-a02ec09a6501': {'page_label': '21',
 'file_name': '1608.05381v2.pdf',
 'file_path': '/content/drive/Shared drives/AI Winter School (Brown Physics CFPU 2025)/Module 5 -- Large Language Models (LLMs) and Retrieval Augmented Generation (RAG)/documents/lux_dd_papers/1608.05381v2.pdf',
 'file_type': 'application/pdf',
 'file_size': 4399092,
 'creation_date': '2025-01-13',
 'last_modified_date': '2025-01-13'},
 '42451e0e-638d-4c21-979c-c3c28ea71eae': {'page_label': '3',
 'file_name': '1-s2.0-S0168900217301158-a

In [None]:
response.source_nodes

[NodeWithScore(node=TextNode(id_='9e523246-bf4e-42b2-9f58-9b09af5ec736', embedding=None, metadata={'page_label': '7', 'file_name': '1-s2.0-S0168900217301158-am.pdf', 'file_path': '/content/drive/Shared drives/AI Winter School (Brown Physics CFPU 2025)/Module 5 -- Large Language Models (LLMs) and Retrieval Augmented Generation (RAG)/documents/lux_dd_papers/1-s2.0-S0168900217301158-am.pdf', 'file_type': 'application/pdf', 'file_size': 11051497, 'creation_date': '2025-01-13', 'last_modified_date': '2025-01-13'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={: RelatedNodeInfo(node_id='2480a473-d79f-4a9c-848c-c0933595225e', node_type='4', metadata={'page_label': '7', 'file_name': '1-s2.0-S0168900217301158-am.pdf', 'file_path': '/content/drive/Shared drives/AI Winter
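
The raw `source_nodes` output is hard to read. As a rough sketch, a small helper could reduce each retrieved node to its file name, page label, and similarity score. The helper `summarize_sources` is hypothetical; it assumes each item exposes `.score` and `.node.metadata` with the `file_name` and `page_label` keys seen in the output above. The stand-in objects below use made-up values so the example runs without an index.

```python
from types import SimpleNamespace

def summarize_sources(source_nodes):
    """Return (file_name, page_label, score) for each retrieved node."""
    rows = []
    for item in source_nodes:
        meta = item.node.metadata
        rows.append((meta.get("file_name"), meta.get("page_label"), item.score))
    return rows

# Stand-in objects with made-up values, mimicking the attributes assumed above.
fake_nodes = [
    SimpleNamespace(score=0.83, node=SimpleNamespace(
        metadata={"file_name": "paper_a.pdf", "page_label": "7"})),
    SimpleNamespace(score=0.79, node=SimpleNamespace(
        metadata={"file_name": "paper_b.pdf", "page_label": "21"})),
]

source_rows = summarize_sources(fake_nodes)
for file_name, page, score in source_rows:
    print(f"{file_name}  p.{page}  score={score:.2f}")

# In the notebook: summarize_sources(response.source_nodes)
```

Listing the file and page for each retrieved chunk makes it easy to open the original PDF and check whether the answer is actually supported by the text.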

# What if we want to add more documents?

## Load the new directory of documents from Google Drive, parse them, and then insert them into the existing index

In [None]:
# Path to the additional Brown Particle Astrophysics Group theses
new_llama_index_data_path = os.path.join(top_level_path, module_path, 'documents', 'other_brownpa_theses')

# Read the new documents from the mounted Google Drive with LlamaIndex
new_documents = SimpleDirectoryReader(new_llama_index_data_path, recursive=True).load_data()

# Get nodes from the new documents
parser = SimpleNodeParser()
new_nodes = parser.get_nodes_from_documents(new_documents)

# Insert the nodes into the existing index.
index.insert_nodes(new_nodes)

In [None]:
question = questions[5]

response = query_rag(question)

print("Question: " + question)
print("")
print("Response: " + response.response)

Question: How low in energy was the ER response measured using 127Xe? Where did the 127Xe come from?

Response: The lowest-energy ER response measured using 127Xe reached down to 186 eV energy deposition. The 127Xe present in the LXe target is a result of cosmogenic activation during its time on the surface before being brought underground.


In [None]:
response.metadata

{'8c34f555-164e-4485-a47f-31a97cad6079': {'page_label': '158',
 'file_name': '20220429_Taylor_PhD_Thesis.pdf',
 'file_path': '/content/drive/Shared drives/AI Winter School (Brown Physics CFPU 2025)/Module 5 -- Large Language Models (LLMs) and Retrieval Augmented Generation (RAG)/documents/other_brownpa_theses/20220429_Taylor_PhD_Thesis.pdf',
 'file_type': 'application/pdf',
 'file_size': 35637882,
 'creation_date': '2025-01-15',
 'last_modified_date': '2025-01-13'},
 'bcace8db-4477-4c0e-ad4f-c8cbf9ea7503': {'page_label': '78',
 'file_name': '20200501_Huang_PhD_thesis_Brown_physics_2019_submit_to_Brown_Repo_v5.pdf',
 'file_path': '/content/drive/Shared drives/AI Winter School (Brown Physics CFPU 2025)/Module 5 -- Large Language Models (LLMs) and Retrieval Augmented Generation (RAG)/documents/other_brownpa_theses/20200501_Huang_PhD_thesis_Brown_physics_2019_submit_to_Brown_Repo_v5.pdf',
 'file_type': 'application/pdf',
 'file_size': 49934873,
 'creation_date': '2025-01-15',
 'last_modifi

In [None]:
response.source_nodes

[NodeWithScore(node=TextNode(id_='8c34f555-164e-4485-a47f-31a97cad6079', embedding=None, metadata={'page_label': '158', 'file_name': '20220429_Taylor_PhD_Thesis.pdf', 'file_path': '/content/drive/Shared drives/AI Winter School (Brown Physics CFPU 2025)/Module 5 -- Large Language Models (LLMs) and Retrieval Augmented Generation (RAG)/documents/other_brownpa_theses/20220429_Taylor_PhD_Thesis.pdf', 'file_type': 'application/pdf', 'file_size': 35637882, 'creation_date': '2025-01-15', 'last_modified_date': '2025-01-13'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={: RelatedNodeInfo(node_id='e473dd35-f541-4def-b17c-bd71464aceb2', node_type='4', metadata={'page_label': '158', 'file_name': '20220429_Taylor_PhD_Thesis.pdf', 'file_path': '/content/drive/Shared drives/A

# Exercise: Google Form question

Please run the following query and submit your answer in the Google Form for Module 5.

Module 5 - https://docs.google.com/forms/d/e/1FAIpQLSfR1Pu7hQcaax-gS4UpOUp5fpg4PQET9njdOuLfWSfwwkR7Aw/viewform?usp=sharing

In [None]:
final_query = "What was the energy of the lowest Qy measurement achieved with DD2016 calibration data? How much lower in energy was this than the 2013 LUX Qy result?"
response = query_rag(final_query)

In [None]:
print("Question: " + final_query)
print("")
print("Response: " + response.response)