Saranya Vijayakumar

Ph.D. Candidate in Computer Science
Carnegie Mellon University
saranyav [at] andrew.cmu.edu

Download CV

Research Vision

My research focuses on enhancing the security, privacy, and interpretability of AI systems by integrating formal methods with neural approaches. I develop theoretical frameworks and practical tools for understanding and improving the robustness of modern AI systems, with applications in fraud detection, program analysis, and secure machine learning.

I am fortunate to be advised by Christos Faloutsos and Matt Fredrikson. Previously, I completed my undergraduate degree at Harvard with a joint concentration in computer science and government, writing my thesis with Cynthia Dwork and Jim Waldo. After Harvard, I spent three years as an associate at Goldman Sachs. I am currently supported by the Department of Defense National Defense Science and Engineering Graduate (NDSEG) Fellowship.

Core research areas: AI Security & Privacy, Neural-Symbolic Methods, Applied ML Systems

Research

Aligned LLMs Are Not Aligned Browser Agents

ICLR 2025

Priyanshu Kumar, Saranya Vijayakumar, Elaine Lau, Tu Trinh, Zifan Wang, Matt Fredrikson

Our work investigates whether the safety training that makes LLMs refuse harmful instructions in chat contexts extends to non-chat agent settings, particularly browser agents. Using BrowserART (Browser Agent Red-teaming Toolkit), a suite of 100 browser-based harmful behaviors, we show that the same LLMs that refuse harmful requests in chat form often fail to maintain those refusals when deployed as browser agents. Under certain attack conditions, GPT-4o- and o1-preview-based browser agents pursued 98 and 63 of the 100 harmful behaviors, respectively, indicating that chat-based safety training does not sufficiently transfer to agent contexts.

PDF
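A back-of-the-envelope way to read the headline numbers above: the core metric is simply the fraction of harmful behaviors a model actually pursues in each deployment context. A minimal sketch (the `pursuit_rate` helper and the sample outcomes below are hypothetical illustrations, not BrowserART's actual API):

```python
def pursuit_rate(results):
    """Fraction of harmful behaviors the model carried out.
    `results` maps behavior id -> True if the behavior was pursued
    (i.e., the safety refusal failed)."""
    pursued = sum(1 for done in results.values() if done)
    return pursued / len(results)

# hypothetical outcomes for the same 4 behaviors in two contexts
chat  = {"b1": False, "b2": False, "b3": False, "b4": True}
agent = {"b1": True,  "b2": True,  "b3": False, "b4": True}
print(pursuit_rate(chat), pursuit_rate(agent))  # -> 0.25 0.75
```

The gap between the two rates is exactly the chat-to-agent safety transfer failure the paper measures.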

Grounding Neural Inference with Satisfiability Modulo Theories

NeurIPS 2023 Spotlight

Zifan Wang*, Saranya Vijayakumar*, Kaiji Lu, Vijay Ganesh, Somesh Jha, Matt Fredrikson


A novel framework combining neural networks with SMT solvers for enhanced reasoning capabilities while maintaining computational efficiency, achieving a 15% accuracy improvement on complex reasoning tasks.

Neural-Symbolic AI, Formal Methods, Machine Learning
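The general idea of grounding neural outputs in symbolic constraints can be illustrated in a few lines: take the network's soft scores, and return the best assignment that a symbolic side condition accepts. This toy sketch enumerates assignments in pure Python instead of calling an SMT solver, and `ground_with_constraint` and the example scores are hypothetical, not the paper's architecture:

```python
from itertools import product

def ground_with_constraint(scores, satisfies):
    """Pick the boolean assignment that maximizes the neural score
    subject to a symbolic constraint -- a toy stand-in for handing
    the network's soft output to an SMT solver."""
    best, best_score = None, float("-inf")
    for bits in product([0, 1], repeat=len(scores)):
        if not satisfies(bits):
            continue  # constraint violated: the solver would reject this model
        # likelihood-style score: prefer bits the network is confident in
        s = sum(p if b else (1 - p) for p, b in zip(scores, bits))
        if s > best_score:
            best, best_score = list(bits), s
    return best

# hypothetical network output: probabilities that each of 3 labels is "on"
scores = [0.9, 0.8, 0.1]
# symbolic side condition: exactly one label may be on
exactly_one = lambda bits: sum(bits) == 1
print(ground_with_constraint(scores, exactly_one))  # -> [1, 0, 0]
```

In the real framework a solver handles constraints far too large to enumerate; the sketch only shows the interface between soft neural scores and hard symbolic conditions.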

CallMine: Fraud Detection and Visualization of Million-Scale Call Graphs

CIKM 2023

Mirela Cazzolato, Saranya Vijayakumar, Meng-Chieh Lee, Namyong Park, Catalina Vajiac, Christos Faloutsos


A system for detecting and visualizing fraud patterns in large-scale telecommunication networks, combining advanced graph mining with interactive visualizations. Successfully deployed with a Portuguese telecom provider.

Graph Mining, Fraud Detection, Visualization

MalCentroid: Tracking Malware Evolution Through Behavioral Primitive Decomposition

Under Review, CCS 2025

Saranya Vijayakumar*, Matt Fredrikson, Christos Faloutsos


This paper introduces MalCentroid, a novel framework for tracking malware evolution through behavioral analysis. While existing malware detection approaches achieve high accuracy on known samples, they typically treat each sample in isolation and rely on surface features that are easily manipulated. MalCentroid addresses both limitations by decomposing malware samples into behavioral primitives extracted from control flow graphs and tracking their evolution through a centroid-based embedding space. The framework maintains multiple behavioral prototypes per malware family, enabling it to track behavioral drift over time and identify truly novel variants. Evaluation on large-scale datasets demonstrates two key advantages: the ability to track malware evolution patterns (showing families exhibiting drift rates up to 0.633 and uncovering parallel evolution across distinct families), and inherent robustness against adversarial manipulation (with most attack vectors achieving less than 5% success rate compared to 97% against image-based approaches). These results suggest that focusing on behavioral primitives rather than surface features provides both better evolution tracking and natural security benefits.

PDF
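The centroid-based tracking idea above can be sketched in a few lines: keep a running centroid of behavioral-feature vectors per family, measure each new sample's distance to it as drift, and flag far-off samples as novel variants. A minimal illustration (the `FamilyTracker` class, the features, and the threshold are hypothetical simplifications of the framework, which maintains multiple prototypes per family):

```python
import math

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class FamilyTracker:
    """Toy single-centroid tracker: each malware family keeps a running
    centroid of behavioral-feature vectors; distant samples are flagged
    as novel variants."""
    def __init__(self, novelty_threshold):
        self.centroids = {}  # family -> (centroid, sample count)
        self.threshold = novelty_threshold

    def observe(self, family, vec):
        if family not in self.centroids:
            self.centroids[family] = (list(vec), 1)
            return "new-family"
        c, n = self.centroids[family]
        drift = dist(c, vec)  # behavioral drift of this sample
        # running-mean update lets the centroid follow gradual drift
        c = [(ci * n + vi) / (n + 1) for ci, vi in zip(c, vec)]
        self.centroids[family] = (c, n + 1)
        return "novel-variant" if drift > self.threshold else "known"

t = FamilyTracker(novelty_threshold=2.0)
t.observe("family-a", [0.0, 0.0])
print(t.observe("family-a", [0.1, 0.2]))  # small drift -> "known"
print(t.observe("family-a", [5.0, 5.0]))  # large drift -> "novel-variant"
```

The robustness argument follows the same intuition: an attacker must change the sample's behavioral features, not just its surface bytes, to move it relative to the centroids.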

Benchmarking AI-Generated Code Detection

Under Review, PAKDD 2025

Saranya Vijayakumar, Christos Faloutsos, Matt Fredrikson

A comprehensive evaluation of traditional, transformer-based, and visual-textual approaches for detecting AI-generated code across multiple programming languages.

PDF

Mechanistically Interpreting a Transformer-based 2-SAT Solver

Under Review, ICML 2025

Nils Palumbo, Ravi Mangal, Zifan Wang, Saranya Vijayakumar, Corina Pasareanu, Somesh Jha

In this work we present a formal framework for mechanistically interpreting neural networks, motivated by abstract interpretation from program analysis. We apply this framework to analyze a Transformer model trained to solve the 2-SAT problem. Through our analysis, we uncover that the model learns a systematic algorithm: it first parses input formulas into clause-level representations in the initial layers, then evaluates satisfiability by enumerating possible Boolean variable valuations. Our work provides evidence that the extracted mechanistic interpretation satisfies our proposed formal axioms, demonstrating how the model systematically solves 2-SAT problems.

PDF
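The enumeration strategy attributed to the model's later layers corresponds to the textbook brute-force decision procedure for 2-SAT, which can be sketched directly (this `solve_2sat` helper is an illustration, not code from the paper):

```python
from itertools import product

def solve_2sat(n_vars, clauses):
    """Decide a 2-SAT formula by enumerating all Boolean valuations --
    the same brute-force strategy the analysis attributes to the
    Transformer's later layers. Literals: +i means x_i, -i means not x_i."""
    def lit_true(lit, val):
        v = val[abs(lit) - 1]
        return v if lit > 0 else not v
    for val in product([False, True], repeat=n_vars):
        if all(lit_true(a, val) or lit_true(b, val) for a, b in clauses):
            return list(val)  # satisfying assignment found
    return None  # no valuation satisfies every clause

# (x1 or x2) and (not x1 or x2) and (not x2 or not x1)
print(solve_2sat(2, [(1, 2), (-1, 2), (-2, -1)]))  # -> [False, True]
```

Enumeration is exponential in the number of variables, which is what makes it a plausible mechanism for a fixed-depth model on small formulas (polynomial-time 2-SAT algorithms exist, but the paper's finding is that the model does not learn one).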

Leveraging Large Language Models for Enhanced Membership Inference

Under Review, 2025

Saranya Vijayakumar, Aman Priyanshu, Suriya Ganesh, Vy Tran, Yash Maurya, Hana Habib, Norman Sadeh

Invited Talks & Academic Service

Security and Privacy in Practice

Information Security, Privacy & Policy (17-331/631), November 21, 2024

Slides
  • Applied security analysis techniques
  • Real-world privacy challenges with LLMs
  • Jailbreaking and watermarking

AI Security and Governance

AI Governance Course (17-416/17-716), April 3, 2024

  • Current landscape of AI security challenges
  • Intersection of technical capabilities and governance frameworks
  • Emerging threats and mitigation strategies

AI Security, Robustness, and Privacy

Information Security, Privacy & Policy (17-331/631), December 5, 2023

  • Overview of current challenges in AI security
  • Discussion of robustness techniques and evaluation
  • Privacy considerations in modern AI systems

Academic Service

Information Security, Privacy & Policy (17-331/631), Fall 2024

Final Project Judge

  • Evaluated student projects on security and privacy implementations
  • Provided technical feedback and industry-relevant insights
  • Helped assess practical applicability of security solutions

Teaching

Teaching Philosophy

Teaching Statement PDF

I believe in creating an inclusive learning environment that emphasizes practical understanding and critical thinking. My teaching approach combines theoretical foundations with hands-on experience, preparing students for both academic and industry challenges.

Teaching Experience

Information Security, Privacy & Policy (17-331/631)
Fall 2023

Teaching Assistant

Instructors: Norman Sadeh and Hana Habib

Course Highlights:

  • Masters-level course covering security and privacy fundamentals
  • Led discussion sections on privacy policies and security frameworks
  • Mentored student projects in privacy and membership inference analysis

Rapid Prototyping Technologies (15-294) & Intermediate Rapid Prototyping (15-394)
Spring 2023

Teaching Assistant

Instructor: Dave Touretzky

Course Highlights:

  • Taught both introductory and intermediate prototyping techniques
  • Supervised hands-on laboratory sessions
  • Provided technical guidance for student projects

Teaching Development

Eberly Center Future Faculty Program

Selected participant in Carnegie Mellon's prestigious teaching development program

  • Completed intensive pedagogical training
  • Developed evidence-based teaching strategies
  • Created inclusive course design frameworks

Mentorship

  • High School Students:

    Mentored Philip Negrin on AI Code Detection research (project video) and a second student on their research project.

  • Masters Students at CMU:

    Guided a team of 4 students on privacy research analyzing Google's Topics API (USENIX PEPR '24).

Prepared to Teach

I am prepared to teach courses at both undergraduate and graduate levels in:

Core Courses:
  • Machine Learning
  • Computer Security
  • Artificial Intelligence
Specialized Topics:
  • Advanced Topics in AI Security
  • Machine Learning Systems
  • Neural-Symbolic Methods in AI

Research Projects & Impact

Large-Scale Fraud Detection Systems

TgrApp System Demo


Developed novel visualization and detection methods for analyzing million-scale fraud patterns in telecommunication networks, leading to deployed solutions with real-world impact.

Formal Methods in Security

Research Visits & Collaborations

Inria Nancy

Formal verification of the Olvid messaging protocol using ProVerif

National Security Applications

  • CSET Review: "Cybersecurity Risks of AI-Generated Code" for Georgetown University
  • Encryption Projects Research
  • Upcoming work on AI-generated code detection (Under Review at PAKDD 2025)

Algorithmic Fairness & Ethics

Public Impact & Research

Early work on fairness in algorithmic decision-making systems, combining technical analysis with policy implications.

Privacy-Preserving Technologies

Topics API Privacy Analysis

Investigating privacy vulnerabilities in Google's Topics API through novel LLM-based approaches

  • Enhanced membership inference techniques
  • Novel reidentification methods
  • Privacy implications for web advertising

Under Review, 2025

Adversarial Privacy Attacks

PEPR 2024 Publication

Novel techniques for evaluating and enhancing privacy protections in modern web APIs

Research Impact

  • Development of new privacy attack vectors using large language models
  • Practical implications for web privacy infrastructure
  • Contributions to privacy-preserving advertising technologies

Service & Leadership

Academic Community Building

  • Founded and lead a weekly lunch program for women and non-binary PhD students in SCS (90+ participants)
  • Reviewed 30+ papers for ICLR, ICML, KDD, and NeurIPS

Outreach & Impact

Awards & Recognition

2024

Best Poster Award

NDSEG Annual Fellows Conference

View Award Announcement →
2023

Graduate Fellowship for STEM Diversity (GFSD)

NSA (Declined)

2022 - 2024

National Defense Science & Engineering Graduate Fellowship

Army Research Office

2021 - 2023

Future Faculty Program

Carnegie Mellon University Eberly Center

Research & Industry Experience

Research Scientist, Quantitative Trading

2018 - 2021

Goldman Sachs - New York, NY

Research Contributions

  • Developed novel statistical methods for analyzing market microstructure, processing 100M+ daily trading records to identify systematic patterns in algorithmic execution
  • Led research initiatives on latency-sensitive distributed systems, resulting in peer-reviewed internal publications on network optimization
  • Architected real-time analytics pipelines using Python and kdb+/q, implementing novel algorithms for trade-execution optimization

Systems & Infrastructure

  • Designed distributed computing framework for processing market data streams across multiple data centers
  • Implemented machine learning models for real-time trade classification and risk monitoring
  • Built visualization tools for analyzing high-dimensional financial data, later adopted across multiple trading desks

Distributed Systems, Machine Learning, High-Performance Computing, Data Visualization

Data Scientist

2022

Beto O'Rourke Senate Campaign

Technical Leadership

  • Developed novel machine learning models for voter behavior prediction using privacy-preserving techniques on sensitive demographic data
  • Architected scalable data pipeline processing 20M+ voter records while maintaining strict privacy guarantees
  • Led technical implementation of real-time analytics dashboard for resource allocation optimization

Research Impact

  • Developed new methodologies for privacy-preserving analysis of demographic data, informing current research in differential privacy
  • Created novel visualization techniques for geographic and demographic data analysis
  • Implemented statistical methods for causal inference in voter outreach effectiveness

Privacy-Preserving ML, Large-Scale Data Analysis, Causal Inference

Technical Expertise

Core Technologies

Python, Java, C, TensorFlow, scikit-learn, PyTorch, ProVerif, Slang

Languages

English: Native
Tamil: Native
Spanish: Professional
Japanese: Limited

Interests

Art Resume

Art Resume PDF

Contact

saranyav [at] andrew.cmu.edu

Links

Location

Gates Hillman Center
Carnegie Mellon University
Pittsburgh, PA 15213

© 2024 Saranya Vijayakumar. Last updated: December 2024