# AI Crawler Directives for HealthArt

# ============================================
# SITE METADATA
# ============================================
Site-Name: HealthArt
Site-URL: https://healthart.in
Description: User-centric, privacy-focused personal health record platform that standardizes lab reports from different labs into a common format. Organizes, visualizes, and enables secure sharing of medical lab reports with family member management and sharing controls. Corrects confusing lab terminology and fills in missing information.
Purpose: Personal health data management, medical report repository, health trend visualization, family health tracking, lab report standardization, data accuracy improvement
Topics: Health Technology, Medical Records, Digital Health, AI in Healthcare, Lab Reports, Health Data Visualization, Personal Health Records, Quantified Self
Target-Audience: Individuals managing personal health data, families tracking multiple members' health records, patients sharing reports with doctors

# ============================================
# CONTEXT & DOCUMENTATION FILES
# ============================================
# Primary documentation for AI agents and LLMs
llms-txt: /llms.txt
llms-txt-description: Concise overview of HealthArt features and capabilities
llms-full-txt: /llms-full.txt
llms-full-txt-description: Comprehensive documentation with detailed feature descriptions, setup guides, and technical notes

# Site structure
sitemap: /sitemap.xml
sitemap-description: Complete list of public and discovery URLs
robots: /robots.txt
robots-description: Crawler access rules and restrictions

# ============================================
# CRAWL DIRECTIVES
# ============================================
User-Agent: *

# Allow all public informational pages
Allow: /
Allow: /tests
Allow: /tests/*
Allow: /labs
Allow: /similar-tests
Allow: /privacy-policy

# Allow discovery files
Allow: /llms.txt
Allow: /llms-full.txt
Allow: /ai.txt
Allow: /sitemap.xml

# Disallow authenticated app pages (require login)
Disallow: /dashboard
Disallow: /reports
Disallow: /profile
Disallow: /share/create
Disallow: /share/*
Disallow: /gmail/*

# Disallow share view pages (dynamic, require PIN)
Disallow: /s/*

# ============================================
# IMPORTANT ROUTING NOTES
# ============================================
# The root route (/) is the public branded entry and sign-in page
# The authenticated app home is /dashboard
# For public content, focus on: /tests, /labs, /similar-tests, /privacy-policy

# ============================================
# CONTENT STRUCTURE & ENTITY TYPES
# ============================================
# Public Pages Entity Types:
Entity-Types: MedicalTest, DiagnosticLab, HealthFeature, TestCategory, LabAccreditation, MedicalTerm

# Medical Tests (/tests)
# - 19+ common medical tests organized by category
# - Categories: Blood Tests, Vitamin Tests, Hormonal Tests, Infectious Disease Tests, Urine Tests, Imaging Tests, Cardiac Tests
# - Each test includes: name, description, purpose, normal ranges, pricing

# Diagnostic Labs (/labs)
# - 60+ major diagnostic lab chains in India
# - Information: name, accreditations (NABL, CAP, ISO), services, contact details

# Similar Tests (/similar-tests)
# - Confusing medical terms explained
# - Test synonyms and alternate names
# - Comparison guides

# ============================================
# APP FEATURES (for understanding, not crawling)
# ============================================
# Core Features:
# - Report upload (manual, WhatsApp, email, Gmail Auto-Import)
# - AI-powered data extraction from PDFs and images
# - Smart standardization: Consolidates reports from different labs into unified format
# - Accuracy improvements: Corrects confusing terminology, standardizes test names, fills missing info
# - Chart and table visualization of health trends with normalized data
# - Family member management with data isolation
# - Secure sharing with PIN protection and expiry controls
# - Report management and status tracking

# ============================================
# CONTENT EXTRACTION HINTS
# ============================================
# When processing /tests pages, extract:
# - Test name and abbreviations
# - Category classification
# - Description and purpose
# - Normal reference ranges
# - Price ranges
# - Related tests

# When processing /labs pages, extract:
# - Lab name and brand
# - Accreditation types (NABL, CAP, ISO)
# - Service offerings
# - Contact information
# - Geographic coverage

# When processing /similar-tests page, extract:
# - Medical term pairs (confusing terms)
# - Synonym groups
# - Clarification explanations

# ============================================
# PRIVACY & DATA HANDLING
# ============================================
# For privacy information, see: /privacy-policy
# Key points:
# - No selling of personal or health data
# - No use of data for advertising
# - Optional upload methods with explicit consent
# - User control over data deletion and sharing

# ============================================
# TECHNICAL NOTES
# ============================================
# Authentication: Google OAuth 2.0
# Frontend: React + Vite
# Routing: React Router (client-side routing)
# Public SEO pages are prerendered as static HTML at build time
# Share links (/s/:shareId) are public but may require PIN
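
# ============================================
# EXAMPLE: HOW THE CRAWL DIRECTIVES RESOLVE (illustrative)
# ============================================
# A sketch only, assuming crawlers apply robots.txt-style precedence
# (RFC 9309: the longest matching rule wins, and Allow wins ties).
# The specific paths below are hypothetical and are not taken from the site.
#
#   /tests/example-test   -> allowed     (Allow: /tests/*)
#   /labs                 -> allowed     (Allow: /labs)
#   /dashboard            -> disallowed  (Disallow: /dashboard outranks Allow: /)
#   /share/example-id     -> disallowed  (Disallow: /share/*)
#   /s/example-share-id   -> disallowed  (Disallow: /s/*)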