Case Studies

LLM Financial Analyst

OVERVIEW

Developed an LLM-driven financial copilot designed to enhance hedge fund managers' decision-making. The model sources concrete evidence from institutional financial reports and qualifies user-provided claims, reducing the human effort required to distill, digest, and synthesize hundreds of pages of global macroeconomic financial research each week. Our key contribution is the technical ideation, design, and implementation of an intuitive web-based interface that allows a user to reason over a large volume of recent research reports from financial institutions.

KEY OUTCOMES

Intuitive UI: A minimal dashboard that supports bulk PDF uploads, free-text user input, and display of model output, including citations of evidence for or against a user-provided thesis.

Domain Knowledge: The model understands financial jargon, industry-specific terminology, abbreviations, and aliases of global financial institutions. The model offers evidence-based justification with configurable summary length and complexity.

Attribute Extraction: Entity extraction including research authors, institutions, dates, securities, market events, and financial authority activity. Statements are decomposed into verifiable claims with temporal reasoning capabilities.

System Performance: Scalable system with minimal latency, distributed hosting of fine-tuned LLMs, and siloed remote storage for documents, user inputs, model responses, and product feedback.

Financial Data Ingestion

OVERVIEW

Overhauled a FinTech company’s ledger and reconciliation platform within their AWS infrastructure. Developed a reliable, efficient, and extensible ingestion process for loan data to responsibly track outstanding loan payments and timelines.

Our key contribution is the implementation of automated data ingestion for an evolving list of loan vendors, including the US Department of Commerce, Fiserv, FSA, SBA, Viva, IFG, Funding Circle, Auxilior, Primis, and Libertas. Relative knowledge time-stamping ensures that relevant data is captured accurately and processed in a consistent and reproducible manner.

KEY OUTCOMES

Operational Robustness: Automated backfilling and validation processes that ensure data is ingested accurately and reduce the need for manual correction.

Scalable and Efficient: Leveraged AWS infrastructure to build an extensible data ingestion pipeline capable of handling high data volume and frequency over time.

Engineering Leadership: Demonstrated effective engineering leadership through the successful implementation of systematic fault-tolerant data ingestion and extensible representation of data providers.

Production Readiness: Addressed production-specific nuances with automated validation, error handling, and recovery mechanisms that minimize system downtime and enforce provider-specific schemas.
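
As an illustration of the provider-specific schema enforcement and knowledge time-stamping described above, the following is a minimal Python sketch, not the production implementation. The provider names, column names, and the `SCHEMAS`, `IngestedRecord`, `validate`, and `ingest` identifiers are all hypothetical, and "knowledge time" is assumed here to mean the moment the pipeline first learned of a record.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical provider schemas: required column -> expected Python type.
SCHEMAS = {
    "provider_a": {"loan_id": str, "balance": float},
    "provider_b": {"loan_id": str, "balance": float, "due_date": str},
}


@dataclass(frozen=True)
class IngestedRecord:
    provider: str
    payload: dict
    knowledge_time: datetime  # assumed meaning: when the pipeline learned of this row


def validate(provider, row):
    """Return a list of schema violations for one row (empty list means valid)."""
    schema = SCHEMAS.get(provider)
    if schema is None:
        return [f"unknown provider: {provider}"]
    errors = []
    for column, expected in schema.items():
        if column not in row:
            errors.append(f"missing column: {column}")
        elif not isinstance(row[column], expected):
            errors.append(f"bad type for {column}: {type(row[column]).__name__}")
    return errors


def ingest(provider, rows, now=None):
    """Split rows into accepted records (stamped with knowledge time) and rejects."""
    now = now or datetime.now(timezone.utc)
    accepted, rejected = [], []
    for row in rows:
        errors = validate(provider, row)
        if errors:
            rejected.append((row, errors))  # kept for later review or backfill
        else:
            accepted.append(IngestedRecord(provider, row, now))
    return accepted, rejected


accepted, rejected = ingest(
    "provider_a",
    [{"loan_id": "L1", "balance": 10.0}, {"loan_id": "L2"}],
)
```

Keeping rejected rows alongside their error messages, rather than dropping them, is one way to support automated backfilling once a provider corrects its feed.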

LLM Tax Preparation

OVERVIEW

Built a B2B software service for automated preparation of federal IRS tax returns, intended to compete with incumbent tax preparation software. Designed and implemented a Directed Acyclic Graph (DAG) construction, validation, and composition process with LLMs to provide a robust production system for automated tax form population and validation.

Our key contribution is the development of programmatic templates that capture values from IRS tax forms while ensuring relational consistency and structural validity. The product offers dynamic DAG construction with human-in-the-loop validation to accelerate the preparation of federal IRS tax returns for individuals and organizations.

KEY OUTCOMES

Automated and Accurate Form Processing: High-accuracy extraction and processing of IRS tax forms to ensure minimal errors and increased tax preparation throughput.

Robust Ontology Mapping: Advanced mapping techniques to enhance system comprehension of nuanced tax codes, concepts, and terminology.

Dynamically Scalable DAG Construction: Efficient extraction and processing of structural relationships of tax form data to ensure scalability and robustness.

Unified Data Representation: Comprehensive global DAG aggregation to provide consistent and accurate representation of tax form fields and inter-form dependencies.

Human-in-the-loop Validation: Expert intervention and refinement to validate system outputs and minimize uncertainty propagation.
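
To make the DAG composition and validation ideas above concrete, here is a minimal Python sketch using the standard-library `graphlib` module. It is an illustration under stated assumptions rather than the production design: the form and field names (`form_a.total`, `form_b.taxable`, and so on) and the `compose` and `evaluation_order` helpers are hypothetical and do not correspond to actual IRS form lines.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical per-form dependency maps: field -> set of fields it is computed from.
form_a = {
    "form_a.total": {"form_a.wages", "form_a.interest"},
}
form_b = {
    # An inter-form dependency: form_b reads a value produced on form_a.
    "form_b.taxable": {"form_a.total", "form_b.deduction"},
}


def compose(*forms):
    """Merge per-form dependency maps into one global dependency map."""
    merged = {}
    for form in forms:
        for field, deps in form.items():
            merged.setdefault(field, set()).update(deps)
    return merged


def evaluation_order(graph):
    """Return fields ordered so every dependency precedes its dependents.

    Raises ValueError if the graph contains a cycle, i.e. it is not a DAG.
    """
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        raise ValueError(f"dependency cycle: {exc.args[1]}") from exc


global_dag = compose(form_a, form_b)
order = evaluation_order(global_dag)
```

A topological order of the global graph gives a safe sequence for populating fields, and a detected cycle is the kind of structural error that could be routed to a human reviewer.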

Our Publications

A Prompt Pattern Catalog to Enhance Prompt Engineering with ChatGPT

Generating Synthetic Returns Conditioned Off Macroeconomic Features with Variational Autoencoders

Semantic Compression with Large Language Models

Using LSTM Networks and Future Gradient Values to Forecast Heart Rate in Biking

Towards Secure Cyber-Physical Information Association for Parts

Cyber-Physical Component Verification with Global Collision Estimation Through Markov Integration