Bigsnarfdude

I build systems that can think, listen, and see.
I focus on practical AI applications and tools for analyzing multimodal content: extracting insights from video with local LLMs and building interactive systems that understand visual and textual data.
BloomFilter, TDigest, HyperLogLog, CuckooFilter, and CountMinSketch are my favorite algebras.
Kaggler (https://www.kaggle.com/vincento). @HackerSchool alumnus. Spaces+VIM.

Wizard101 - AI Safety Research

A comprehensive AI safety project implementing GuardReasoner, a state-of-the-art reasoning-based LLM safety classifier with an 84% F1 score. It features two-stage training (R-SFT + HS-DPO), cost-effective data generation using Gemini 2.0 (600× cheaper than GPT-4o), and transparent step-by-step reasoning for safety decisions. The project currently reaches 59% accuracy with LLaMA 3.2-3B and targets 80-85% with the full pipeline.
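For readers unfamiliar with the preference-optimization stage, here is a minimal sketch of the standard DPO loss for a single (chosen, rejected) pair. It is not the project's code and does not include whatever hard-sample weighting HS-DPO adds on top; the function name, argument names, and the default beta of 0.1 are assumptions made for illustration.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one preference pair (illustrative sketch).

    Each argument is the summed log-probability of a full response under
    either the trainable policy or the frozen reference model.
    """
    # How much more the policy likes each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(logits)), written to avoid overflow for large |logits|.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))
```

When the policy and the reference agree on both responses the loss equals log 2, and it falls as the policy shifts probability toward the chosen response relative to the reference.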

Text Diffusion Models

Exploring diffusion models for text generation, applying techniques from image generation to natural language. This work investigates novel approaches to controlled text synthesis using diffusion processes, bridging computer vision and NLP methodologies.

LLM Abuse Patterns

Cataloging and analyzing adversarial patterns, prompt injection techniques, and security vulnerabilities in large language models. This research focuses on identifying, documenting, and mitigating abuse vectors to build more robust and secure AI systems.

PaperTrail Modern

A modern Python/Flask event processing system for compliance monitoring, combining probabilistic data structures (HyperLogLog, Bloom Filters, TopK, Count-Min Sketch) with Algebird-style monoid abstractions. Enables memory-efficient real-time analytics for security and audit logging with privacy-preserving compliance for GDPR/CCPA scenarios.
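To illustrate what an Algebird-style monoid buys you here, the following is a minimal sketch of a Count-Min Sketch whose merge is an associative `+` with an empty sketch as identity. It is not taken from PaperTrail Modern; the class name, the `merge_all` helper, the SHA-256 based row hashing, and the default width and depth are all assumptions chosen for the example.

```python
import hashlib
from dataclasses import dataclass, field
from functools import reduce


@dataclass
class CountMinSketch:
    """Count-Min Sketch that forms a monoid under `+` (illustrative sketch)."""
    width: int = 256
    depth: int = 4
    table: list = field(default_factory=list)

    def __post_init__(self):
        # An empty sketch (all zeros) is the identity element for merging.
        if not self.table:
            self.table = [[0] * self.width for _ in range(self.depth)]

    def _columns(self, item: str):
        # One deterministic hash per row, derived by salting with the row index.
        for row in range(self.depth):
            digest = hashlib.sha256(f"{row}:{item}".encode()).digest()
            yield row, int.from_bytes(digest[:8], "big") % self.width

    def add(self, item: str, count: int = 1) -> None:
        for row, col in self._columns(item):
            self.table[row][col] += count

    def estimate(self, item: str) -> int:
        # Never underestimates: hash collisions can only inflate a counter.
        return min(self.table[row][col] for row, col in self._columns(item))

    def __add__(self, other: "CountMinSketch") -> "CountMinSketch":
        if (self.width, self.depth) != (other.width, other.depth):
            raise ValueError("sketches must share dimensions to merge")
        merged = [[a + b for a, b in zip(r1, r2)]
                  for r1, r2 in zip(self.table, other.table)]
        return CountMinSketch(self.width, self.depth, merged)


def merge_all(sketches, width=256, depth=4):
    """Fold any number of sketches together, starting from the identity."""
    return reduce(lambda a, b: a + b, sketches, CountMinSketch(width, depth))


# Two workers count events independently, then their sketches are merged.
shard_a, shard_b = CountMinSketch(), CountMinSketch()
for user in ["alice", "bob", "alice"]:
    shard_a.add(user)
shard_b.add("alice", 2)
total = merge_all([shard_a, shard_b])
```

Because the merge is associative and has an identity, per-worker or per-time-window sketches can be combined in any grouping, which is what makes this style convenient for streaming and distributed aggregation.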

AI-Enhanced Security

Combining my security background with AI capabilities, I've developed tools for monitoring AWS CloudTrail logs, detecting anomalies that indicate compromised logins, and running ML-based security monitoring. Recently, I created an MLLM-based Incident Response Agent to speed up evidence collection after data breaches.

LLM Monitoring & Analysis

Developing tools to monitor and analyze LLM responses and interactions. The chatMonitor project helps review and evaluate responses from language models, providing insights into their performance and behavior when processing video content.

Featured Projects

My recent work focuses on multimodal AI, LLMs, and robust evaluation frameworks for testing language models. I've created several benchmarks and evaluation tools to assess LLM performance across different domains. I primarily code in Python, with some Scala and C++ for specific applications. Here are some of my highlighted projects:

wizard101 - AI Safety

State-of-the-art reasoning-based LLM safety classifier implementing GuardReasoner. Two-stage training with cost-effective Gemini 2.0 data generation (600× cheaper than GPT-4o).

text-diffusion

Diffusion models for text generation, applying image generation techniques to natural language processing.

llm-abuse-patterns

Comprehensive catalog of adversarial patterns, prompt injection techniques, and security vulnerabilities in LLMs.

papertrail-modern

Event processing system using probabilistic data structures for compliance monitoring with privacy-preserving analytics.

chatMonitor

Monitoring tool for videoLLM interactions that analyzes chat patterns and performance metrics.

syntheticLLM

Creating synthetic data for training specialized language models with focus on domain-specific knowledge.

Incident-Response-Agent

MLLM-based IR agent that speeds up evidence collection after data breaches and provides automated security assistance.

testing_harness

Comprehensive evaluation framework for assessing LLM capabilities, biases, and limitations with standardized test suites.

math_benchmark

Evaluation framework for benchmarking mathematical reasoning capabilities of language models and neural networks.

Technical Skills

Python; PyTorch; Diffusion Policies for robots; Vision LLMs; LLM Evaluation Frameworks; Computer Vision; OpenCV; AWS EC2, S3, Lambda; Azure Data Warehouse

I'm a gunslinger turned code slinger, writing stuff that humans & computers can read. I've worked at the RCMP, CIBC, and Deloitte. @TechStars Chicago 2013. Proud @HackerSchool alumnus, Winter 2012. Competition Expert-level Kaggler: https://www.kaggle.com/vincento

Focused on building state-of-the-art machine perception solutions for robotic vehicles and interactive AI systems.

SecOps -> CISO -> Security Engineer -> Data Engineer -> Machine Learning Engineer

Get in touch

Find me on GitHub