Genetics AI Platform Blog

1. Genetics LLM

A domain-adapted large language model specialized for genetics and molecular biology research. The model shows improved accuracy and contextual understanding on domain-specific queries compared with general-purpose LLMs.

Training:

Base Model: Qwen2-1.5B with LoRA fine-tuning (rank=16, alpha=32), a parameter-efficient method that reduces the risk of catastrophic forgetting
Dataset: 98K curated Q&A pairs covering CRISPR mechanisms, DNA sequencing technologies, hereditary diseases, gene expression, and molecular diagnostics
Infrastructure: 3 epochs on an NVIDIA A100 GPU (40GB VRAM) via HuggingFace training infrastructure with mixed precision (bf16)
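
To make the LoRA settings above concrete, here is a minimal sketch of what rank=16 and alpha=32 imply for a single weight matrix. It uses only the standard library and is not the project's training code. The `LoraSettings` class and the `HIDDEN` value are hypothetical names chosen for illustration, and the layer size is an assumption rather than a figure from the training setup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraSettings:
    """Hypothetical helper holding the LoRA hyperparameters listed above."""

    rank: int = 16   # r, from the training description
    alpha: int = 32  # lora_alpha, from the training description

    @property
    def scaling(self) -> float:
        # LoRA scales the low-rank update B @ A by alpha / r.
        return self.alpha / self.rank

    def adapter_params(self, d_in: int, d_out: int) -> int:
        # A has shape (r, d_in) and B has shape (d_out, r).
        return self.rank * (d_in + d_out)


settings = LoraSettings()
HIDDEN = 1536  # assumed layer width, for illustration only

full = HIDDEN * HIDDEN                             # weights in one dense square layer
adapter = settings.adapter_params(HIDDEN, HIDDEN)  # trainable LoRA weights for that layer
print(f"scaling={settings.scaling} adapter={adapter} full={full} ratio={adapter / full:.4f}")
```

Under these assumptions the adapter trains only a small fraction of the weights in each adapted layer, which is why the base model's weights can stay frozen during fine-tuning.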

Qwen2-1.5B LoRA/PEFT HuggingFace PyTorch

2. Genetics API (inference)

Production-grade REST API providing scalable, low-latency access to the Genetics LLM for real-time inference.

Architecture:

Framework: FastAPI with async request handling and automatic OpenAPI documentation
Backend: HuggingFace Inference Endpoints with GPU acceleration (NVIDIA T4)
Security: API key-based authentication with rate limiting and request validation
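
The security line above mentions API key authentication with rate limiting. The following is a minimal, framework-independent sketch of one way such a check could work, using a sliding-window limiter per key. It is not the API's actual implementation: the class name, function name, limits, and the injectable `clock` parameter are all assumptions made for illustration and testing.

```python
import hmac
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds for each API key."""

    def __init__(self, limit: int, window: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the behavior can be tested without sleeping
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, api_key: str) -> bool:
        now = self.clock()
        hits = self._hits[api_key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


def is_valid_key(presented: str, expected: str) -> bool:
    # Constant-time comparison avoids leaking key contents through timing.
    return hmac.compare_digest(presented.encode(), expected.encode())


# Usage with a fake clock: two requests allowed, the third rejected,
# then allowed again once the 60-second window has passed.
fake_now = [0.0]
limiter = SlidingWindowLimiter(limit=2, window=60.0, clock=lambda: fake_now[0])
results = [limiter.allow("demo-key") for _ in range(3)]
fake_now[0] = 61.0
results.append(limiter.allow("demo-key"))
```

In a FastAPI service, a check like this would typically run in a dependency or middleware before the request reaches the inference handler. An in-memory limiter only works per process, so a multi-node deployment would need shared state.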

Infrastructure:

Compute: AWS EKS (Kubernetes) with 2-node managed cluster
Network: Application Load Balancer with SSL/TLS termination via ACM
CI/CD: GitHub Actions pipeline with automated ECR push and kubectl rollout

FastAPI HF Endpoints Docker EKS

3. Genetics AI Assistant (Chatbot)

Interactive web application enabling natural language conversations with the Genetics LLM for researchers and students.

Stack:

Frontend: React 18 with TypeScript for type-safe development, built with Vite
Styling: Bootstrap CSS with custom design system
Deployment: S3 + CloudFront for global CDN distribution

Features:

Real-time streaming responses with typing indicators
Conversation history with local storage persistence
Mobile-responsive layout with example prompts

React 18 TypeScript Bootstrap CloudFront

4. AI Vision/Image Metadata Studio for Machine Learning

Professional image annotation tool for creating high-quality labeled datasets for machine learning models. Supports polygon annotations, time-lapse comparison with auto-detection, and multiple export formats.

Generate labeled datasets for Vision Transformers, Diffusion Models, and Object Detection
Annotate individual images with regions and points
Add captions and metadata for training
Export in COCO JSON, YOLO, HuggingFace JSONL, and CSV formats
Import metadata files to restore annotations for any image

Annotation Features:

Tools: Polygon regions, point markers, measurements with scale calibration
Metadata: Labels, notes, confidence scores, captions, and tags
Histopathologic Images: Uses the Cellpose API for automated cell and nuclei detection and segmentation in microscopy images, with Cellpose's nuclei and cyto3 models
Export: COCO JSON, YOLO, HuggingFace JSONL, CSV formats
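
As an illustration of the export formats listed above, here is a minimal sketch of how one polygon annotation could be turned into a COCO annotation entry and a YOLO label line. It is not the tool's exporter: the function names are hypothetical, and the YOLO output assumes the bounding-box detection layout (class, normalized center x, center y, width, height) rather than the polygon segmentation variant.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def polygon_area(points: List[Point]) -> float:
    """Shoelace formula; returns the absolute area in square pixels."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def bounding_box(points: List[Point]) -> Tuple[float, float, float, float]:
    """Axis-aligned box as (x_min, y_min, width, height)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)


def to_coco_annotation(ann_id: int, image_id: int, category_id: int,
                       points: List[Point]) -> Dict:
    x, y, w, h = bounding_box(points)
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": category_id,
        # COCO stores a polygon as one flat list [x1, y1, x2, y2, ...].
        "segmentation": [[coord for point in points for coord in point]],
        "bbox": [x, y, w, h],
        "area": polygon_area(points),
        "iscrowd": 0,
    }


def to_yolo_line(class_index: int, points: List[Point],
                 img_w: int, img_h: int) -> str:
    x, y, w, h = bounding_box(points)
    # YOLO detection labels use the box center and size, normalized to [0, 1].
    cx, cy = (x + w / 2) / img_w, (y + h / 2) / img_h
    return f"{class_index} {cx:.6f} {cy:.6f} {w / img_w:.6f} {h / img_h:.6f}"


square = [(10.0, 20.0), (50.0, 20.0), (50.0, 60.0), (10.0, 60.0)]
coco = to_coco_annotation(ann_id=1, image_id=1, category_id=3, points=square)
yolo = to_yolo_line(class_index=2, points=square, img_w=100, img_h=200)
```

The two formats differ mainly in coordinate conventions: COCO keeps absolute pixel values and a top-left box origin, while YOLO normalizes by image size and uses the box center.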

Time-Lapse Comparison:

Auto-Detection: Region tracking across image series using color profiling
Analytics: Area progression charts with growth statistics
Demo Data: Histopathology samples (Lung, Liver, Breast, Cancer)
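
The growth statistics behind an area progression chart can be sketched as follows. This is an assumption about what such statistics might include (total change, percent change, and per-step changes), not a description of the tool's actual analytics. The `growth_stats` function and its output keys are hypothetical.

```python
from typing import Dict, List


def growth_stats(areas: List[float]) -> Dict[str, float]:
    """Summarize how a tracked region's area changes across an ordered image series."""
    if len(areas) < 2:
        raise ValueError("need at least two area measurements")
    if areas[0] <= 0:
        raise ValueError("first area must be positive to compute percent change")
    # Change in area between each pair of consecutive images.
    steps = [later - earlier for earlier, later in zip(areas, areas[1:])]
    total = areas[-1] - areas[0]
    return {
        "total_change": total,
        "percent_change": 100.0 * total / areas[0],
        "mean_step_change": sum(steps) / len(steps),
        "max_step_change": max(steps),
    }


# Example series of region areas, one value per image in the time-lapse.
stats = growth_stats([100.0, 120.0, 150.0, 210.0])
```

Areas are unitless here. With scale calibration applied, the same calculation would report physical units instead of square pixels.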

Infrastructure:

Frontend: React 19 + TypeScript + Vite + Canvas API
Backend: AWS SAM (Lambda + API Gateway + DynamoDB)
Deployment: S3 + CloudFront with GitHub Actions CI/CD

React 19 TypeScript Canvas API AWS SAM Lambda S3 CloudFront

System Architecture

[Architecture diagrams, one per project: 1. Genetics LLM; 2. Genetics API (inference); 3. Genetics AI Assistant (Chatbot); 4. AI Vision/Image Metadata Studio for Machine Learning]