Date of Award
4-2025
Degree Type
Thesis
Degree Name
Master of Science (MS)
Department
Computer Science
First Advisor
Dr. Alfredo Perez
Abstract
This thesis explores the application of Large Language Models (LLMs) to improve the generation, classification, and accessibility of privacy policy content. As frameworks such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA) evolve, organizations face growing pressure to present transparent, user-friendly privacy policies. However, the complexity of legal language and the variability of policy structures pose challenges for traditional analysis methods. This research addresses these issues using the LLaMA 2 open-source models developed by Meta. The study centers on two key investigations.

The first evaluates the base LLaMA 2 model (zero-shot) and its 4-bit, 5-bit, and 8-bit quantized versions for generating coherent, legally sound privacy policy text for IoT applications. Outputs are assessed using ROUGE-LSum, BERT Precision Score, Word2Vec, and GloVe to measure fluency, contextual relevance, and regulatory alignment. Findings highlight trade-offs between model efficiency and output quality, demonstrating the potential of quantized models in resource-constrained environments.

The second case study addresses sentence-level multi-label classification using the OPP-115 dataset, which includes web-based privacy policies from diverse domains. It evaluates the performance of base, fine-tuned, and quantized fine-tuned LLaMA 2 models in categorizing individual privacy statements into key regulatory categories such as First Party Collection/Use, Third Party Sharing/Collection, and Data Retention, among others. Fine-tuning significantly improves classification performance, while quantized fine-tuned models achieve comparable accuracy with lower computational demands. Additionally, an interactive AI-powered chatbot is developed using the fine-tuned model, allowing users to submit privacy policy text and receive categorized outputs along with simplified explanations, thereby enhancing user comprehension.

This research demonstrates how LLaMA 2 models, particularly when fine-tuned and quantized, can support scalable, interpretable, and user-centered privacy policy automation. It offers practical insights for AI developers, privacy professionals, and policymakers while laying the groundwork for future work on hybrid approaches that combine model compression, fine-tuning, and conversational interfaces for privacy and digital rights communication.
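For readers unfamiliar with the evaluation metrics named in the abstract, the following minimal sketch (not drawn from the thesis itself) illustrates how a generated privacy policy sentence could be scored against a reference using ROUGE-LSum and BERTScore via the rouge_score and bert_score Python packages; the sample sentences are placeholders, and the thesis's own evaluation pipeline may differ.

    # Illustrative sketch only: scoring generated policy text against a reference
    # with two of the metrics named in the abstract (ROUGE-LSum and BERT precision).
    from rouge_score import rouge_scorer
    from bert_score import score as bert_score

    reference = "We collect device identifiers to provide and improve the service."   # placeholder
    candidate = "The app gathers device identifiers to deliver and improve its features."  # placeholder

    # ROUGE-LSum: longest-common-subsequence overlap, summary-level variant
    scorer = rouge_scorer.RougeScorer(["rougeLsum"], use_stemmer=True)
    rouge = scorer.score(reference, candidate)["rougeLsum"]
    print(f"ROUGE-LSum F1: {rouge.fmeasure:.3f}")

    # BERTScore: contextual-embedding similarity (the thesis reports the precision component)
    P, R, F1 = bert_score([candidate], [reference], lang="en")
    print(f"BERT precision: {P.mean().item():.3f}")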
Recommended Citation
Malisetty, Bhavani, "IMPROVING THE UNDERSTANDING OF PRIVACY POLICIES USING LARGE LANGUAGE MODELS (LLMS)" (2025). Computer Science Theses, Dissertations, and Student Creative Activity. 6.
https://digitalcommons.unomaha.edu/compscistudent/6
Comments
The author holds the copyright to this work. Permission for any reuse must be obtained from the author directly.