Yulin Shi

AI Engineer & Photographer

Background

Bridging logic & emotion

Selected Projects

Building innovative solutions at the intersection of AI and real-world applications.

NLP

Legomnia

I developed Legomnia, a search engine for complex legal documents built on Elasticsearch 8.9. The platform stores documents as vectors and uses a hybrid retriever that combines keyword-based (syntactic) and semantic search. It also supports advanced query techniques for retrieving relevant information from intricate legal corpora. The system automates the extraction and indexing of metadata, which improves document filtering and management.
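
As a rough illustration of how a hybrid retriever can merge keyword and semantic results, here is a minimal sketch using reciprocal rank fusion (RRF). The description above does not say which fusion method Legomnia uses, so RRF is an assumption, and the document IDs and the `reciprocal_rank_fusion` helper are invented for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score descending; break ties by doc ID for deterministic output.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from the two retrievers (IDs invented for illustration).
keyword_hits = ["contract_12", "statute_7", "ruling_3"]
semantic_hits = ["ruling_3", "contract_12", "memo_9"]

fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
print(fused)
```

Documents that appear in both lists rise to the top, while documents found by only one retriever are still kept in the fused ranking.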

Python, Elasticsearch 8.9, Vector Search, Hybrid Retrieval
AI Agent

Remo AI Sales Agent

I'm building Remo, an AI agent that joins video calls with website visitors, talks to them by voice, and navigates the product website in real time (clicking, scrolling, and showing features) while the visitor watches. The backend runs as three independent processes that communicate through Redis pub/sub: a Voice Worker that handles the conversation via LiveKit, a Browser Worker that controls headless Chrome instances and streams their screenshots as live video, and an API Server for configuration and billing. The agent follows a state machine: qualify the visitor, plan a personalized demo, navigate the website section by section, narrate what's happening, answer questions at any time via RAG, and close with a meeting or trial offer. To reduce voice latency, predictable narrations are pre-generated when the agent is configured, and only the dynamic parts go through the LLM in real time.
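
The state machine described above might look roughly like the sketch below. It is an illustration under stated assumptions, not Remo's actual code: the `Stage` and `DemoSession` names, the log format, and the section names are all hypothetical, and the voice, browser, and RAG components are replaced by simple log entries.

```python
from enum import Enum, auto


class Stage(Enum):
    QUALIFY = auto()
    PLAN_DEMO = auto()
    NAVIGATE = auto()
    CLOSE = auto()
    DONE = auto()


class DemoSession:
    """Hypothetical session object that walks through the demo stages."""

    def __init__(self, sections):
        self.stage = Stage.QUALIFY
        self.sections = list(sections)  # website sections still to show
        self.log = []

    def answer_question(self, question):
        # Questions are allowed in any stage and never change the stage.
        self.log.append(f"answer:{question}")

    def advance(self):
        if self.stage is Stage.QUALIFY:
            self.stage = Stage.PLAN_DEMO
        elif self.stage is Stage.PLAN_DEMO:
            # Skip straight to closing if there is nothing to show.
            self.stage = Stage.NAVIGATE if self.sections else Stage.CLOSE
        elif self.stage is Stage.NAVIGATE:
            section = self.sections.pop(0)
            self.log.append(f"narrate:{section}")
            if not self.sections:
                self.stage = Stage.CLOSE
        elif self.stage is Stage.CLOSE:
            self.log.append("offer:meeting_or_trial")
            self.stage = Stage.DONE
        return self.stage


session = DemoSession(["pricing", "dashboard"])
session.advance()  # QUALIFY -> PLAN_DEMO
session.advance()  # PLAN_DEMO -> NAVIGATE
session.advance()  # narrates "pricing", stays in NAVIGATE
session.answer_question("Is there an API?")  # interruption, stage unchanged
session.advance()  # narrates "dashboard", NAVIGATE -> CLOSE
session.advance()  # makes the offer, CLOSE -> DONE
print(session.stage, session.log)
```

Keeping question handling outside `advance` mirrors the idea that the visitor can interrupt at any point without derailing the planned demo.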

FastAPI, LiveKit, Playwright, Redis, RAG, LLM
NLP

FeedPaper

I architected FeedPaper, an automated multi-agent research pipeline that streamlines the daily discovery of academic papers. The system pulls new articles from arXiv RSS feeds, indexes them in a vector database, and uses hybrid search to surface results that match user-defined topics. Large Language Models (LLMs) then explain why each selected paper is relevant to the user's interests. A fully automated Celery backend manages the end-to-end lifecycle, from scraping and indexing to delivering personalized daily email reports.
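
The first stage of such a pipeline, reading feed items before they are indexed, could be sketched as follows. This is only an illustration: it assumes a plain RSS 2.0 layout (the real arXiv feed may use different or additional fields), the sample feed is made up, and the keyword filter in `matches_topic` is a simple stand-in for the hybrid search and LLM relevance steps, not the method FeedPaper uses.

```python
import xml.etree.ElementTree as ET

# Made-up feed for illustration; not real arXiv data.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Sample feed</title>
    <item>
      <title>Hybrid Retrieval for Legal Search</title>
      <link>https://example.org/paper/1</link>
      <description>We combine sparse and dense retrieval.</description>
    </item>
    <item>
      <title>Diffusion Models for Image Synthesis</title>
      <link>https://example.org/paper/2</link>
      <description>A survey of generative image models.</description>
    </item>
  </channel>
</rss>"""


def parse_items(feed_xml):
    """Extract title, link, and summary from every <item> in the feed."""
    root = ET.fromstring(feed_xml)
    papers = []
    for item in root.iter("item"):
        papers.append({
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "summary": (item.findtext("description") or "").strip(),
        })
    return papers


def matches_topic(paper, topic_keywords):
    """Case-insensitive keyword match (placeholder for hybrid search)."""
    text = f"{paper['title']} {paper['summary']}".lower()
    return any(kw.lower() in text for kw in topic_keywords)


papers = parse_items(SAMPLE_FEED)
relevant = [p for p in papers if matches_topic(p, ["retrieval"])]
print([p["title"] for p in relevant])
```

In a production setup, each of these functions would typically become its own background task so that scraping, indexing, and reporting can be scheduled and retried independently.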

Python, Celery, LLM Integration, Vector Database, Hybrid Search, Multi-Agent Systems

Written Synapses

Deconstructing algorithms, ethics, and the philosophy of intelligence.

Engineering, AI Architecture

Zero-Cost AI MVP

The backbone of this architecture is Supabase, chosen as the Backend-as-a-Service platform because its free tier provides not only the relational database but also crucial services such as user authentication and file storage.

Hackathons

Competing and building under pressure — from AI challenges to live hackathons.

4th Prize Winner

NASA Breath Diagnostics Challenge

December 2024

Developed a classification model that analyzes NASA E-Nose data to distinguish COVID-positive from COVID-negative breath samples. Applied data preparation and AI techniques to optimize performance despite the limited sample size.

Machine Learning, Classification
First Prize Winner

Cryptocurrency Price Prediction Challenge

June 2024

Conducted data analysis and preprocessing to ensure data quality and relevance. Improved prediction accuracy through model selection and concatenation, and developed a high-performance predictive model to capture market trends.

Machine Learning, Decision Tree
Finalist

Mistral AI Paris Hackathon 2024

May 2024

Developed a model that generates guitar tablature and MIDI files from a few initial measures, a desired style, and a key. Using an open-source dataset of guitar recordings, we fine-tuned the Mistral-7B model to produce coherent musical continuations.

LLM, Fine-Tuning, QLoRA, Quantization