📐 ETL Pipeline — Combine multiple clinical datasets before analysis. Supports row-append (same columns), column-join (shared patient ID), and smart merge (auto-detect common fields).
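The three operations named above can be sketched in plain Python on lists of row dictionaries. This is a minimal illustration, not the tool's actual implementation: the function names, the `patient_id` key, and the choice of an inner join are all assumptions.

```python
from typing import Dict, List

Row = Dict[str, object]


def row_append(a: List[Row], b: List[Row]) -> List[Row]:
    """Stack two datasets, refusing to append when their columns differ."""
    if a and b and set(a[0]) != set(b[0]):
        mismatch = sorted(set(a[0]) ^ set(b[0]))
        raise ValueError(f"column mismatch: {mismatch}")
    return [dict(r) for r in a] + [dict(r) for r in b]


def column_join(a: List[Row], b: List[Row], key: str = "patient_id") -> List[Row]:
    """Inner-join on a shared key (hypothetical default: patient_id).

    Assumes the key is unique in b; if it repeats, the last row wins.
    """
    index = {r[key]: r for r in b}
    return [{**r, **index[r[key]]} for r in a if r[key] in index]


def common_fields(a: List[Row], b: List[Row]) -> List[str]:
    """Smart-merge helper: columns present in both datasets."""
    if not a or not b:
        return []
    return sorted(set(a[0]) & set(b[0]))


demographics = [{"patient_id": 1, "age": 54}, {"patient_id": 2, "age": 61}]
labs = [{"patient_id": 2, "ldl": 130}]
joined = column_join(demographics, labs)
```

A smart merge could call `common_fields` first and then use the detected column as the join key. With a dataframe library the same operations would typically be a concatenate and a merge.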
📂 Loaded Datasets
Upload files first to see datasets here
⚙️ Operation Configuration
Select an operation to see preview
📋 Operation Log
No operations performed yet
📊 Clinical Dataset Overview
Data Quality Score
🧩 Column Profiles
📋 Data Preview
↔ Scroll · columns · First 10 rows
📊 Numeric Summary
🏷️ Categorical Cardinality
🔬 Clinical Column Detection
🔵 Missing Value Heatmap
🔢 Data Type Distribution
🧹 Clinical Data Cleaning Pipeline
❓ Missing Value Handler (Interactive)
📋 Cleaning Audit Log
♻️ Duplicate Patient Records
⚠️ Clinical Outlier Detection (|z-score| > 2.5)
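A z-score rule like the one in this panel title can be sketched as follows. The function name is hypothetical, and the use of the population standard deviation is an assumption; the tool may use the sample standard deviation instead.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def zscore_outliers(values: Sequence[float], threshold: float = 2.5) -> List[int]:
    """Return indices of values lying more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (assumption)
    if sigma == 0:
        return []  # all values identical, so nothing can be an outlier
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Twenty identical systolic readings plus one extreme value
readings = [120.0] * 20 + [300.0]
flagged = zscore_outliers(readings)
```

Because the mean and standard deviation are themselves pulled by extreme values, a z-score rule can miss outliers in small or heavily skewed samples; a median-based rule is a common alternative.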
🏥 Clinical Data Standardization Engine
🔬 Standardization Engine — Normalize disease names (ICD-10), drug names (generic ↔ brand), and measurement units (mg/mcg/IU). Follows WHO & HL7 FHIR standards for interoperability.
🦠 Disease Names
💊 Drug Names
📏 Units
🔢 ICD-10 Codes
🦠 Disease Name Normalization
Detected disease-related values in your dataset. Variants are mapped to WHO standard names.
💊 Drug Name Standardization (Brand → Generic)
Detected brand/informal drug names. Mapped to WHO International Nonproprietary Names (INN).
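Brand-to-generic mapping is, at its simplest, a normalized dictionary lookup. The sketch below uses a tiny illustrative table; the table contents and function name are assumptions, and a real system would rely on a maintained drug dictionary rather than a hard-coded map.

```python
from typing import Tuple

# Illustrative subset only; keys are lower-cased brand names, values are INNs.
BRAND_TO_INN = {
    "tylenol": "paracetamol",
    "advil": "ibuprofen",
    "lipitor": "atorvastatin",
    "glucophage": "metformin",
}


def to_generic(name: str) -> Tuple[str, bool]:
    """Return (standardized name, whether a mapping was applied)."""
    cleaned = name.strip()
    key = cleaned.lower()
    if key in BRAND_TO_INN:
        return BRAND_TO_INN[key], True
    return cleaned, False  # unknown names pass through unchanged


mapped = to_generic("  Tylenol ")
unmapped = to_generic("Metformin")
```

Returning the flag alongside the name lets a caller record which values were changed, which suits an audit log.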
📏 Measurement Unit Standardization
Converts and validates dosage units. Flags potentially dangerous dose discrepancies.
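A minimal sketch of unit conversion and discrepancy flagging, assuming doses are normalized to milligrams. The function names and the 10-fold threshold are hypothetical. IU is deliberately not converted, because the mass equivalent of an international unit depends on the substance.

```python
# Multipliers to milligrams for mass units.
TO_MG = {"g": 1000.0, "mg": 1.0, "mcg": 0.001, "µg": 0.001, "ug": 0.001}


def to_mg(value: float, unit: str) -> float:
    """Convert a mass dose to milligrams; reject IU and unknown units."""
    key = unit.strip().lower()
    if key == "iu":
        raise ValueError("IU conversion is substance-specific")
    if key not in TO_MG:
        raise ValueError(f"unknown unit: {unit!r}")
    return value * TO_MG[key]


def flag_discrepancy(dose_mg: float, expected_mg: float, max_ratio: float = 10.0) -> bool:
    """Flag doses that differ from the expected dose by max_ratio-fold or more.

    A 1000-fold gap is the typical signature of an mg/mcg mix-up.
    """
    if dose_mg <= 0 or expected_mg <= 0:
        return True  # non-positive doses are always suspicious
    ratio = dose_mg / expected_mg
    return ratio >= max_ratio or ratio <= 1 / max_ratio


suspicious = flag_discrepancy(to_mg(50, "mg"), to_mg(50, "mcg"))
```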
🔢 ICD-10 Code Lookup
Look up and validate ICD-10 diagnosis codes in your dataset.
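Validation can start with a shape check before any dictionary lookup. The pattern below is a deliberately loose assumption (one letter, two digits, then an optional dot and one to four alphanumerics) and does not attempt to cover every national variant. It checks format only, not whether the code exists.

```python
import re

# Loose shape check: letter, two digits, optional ".": plus 1-4 alphanumerics.
ICD10_SHAPE = re.compile(r"[A-Z][0-9]{2}(\.[0-9A-Z]{1,4})?")


def looks_like_icd10(code: str) -> bool:
    """Return True when the string has the general shape of an ICD-10 code."""
    return ICD10_SHAPE.fullmatch(code.strip().upper()) is not None


checks = {c: looks_like_icd10(c) for c in ["I10", "e11.9", "11E", "E11."]}
```

Codes that pass this check would still need to be looked up in an official ICD-10 release to confirm they are valid.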
📈 Exploratory Data Analysis (EDA)
📐 Descriptive Statistics
Column:
🔗 Clinical Correlations
🧪 Distribution & Normality
📊 Hypothesis Testing
📊 Value Distribution Histogram
📦 Avg Value by Clinical Group
🎯 Disease Prevalence by Group
🔵 Clinical Scatter Plot
🧬 Feature Store
🧬 Feature Engineering — Creates clinically meaningful derived features before ML. Industry standard: clean, reusable features → Feature Store → ML pipeline. Avoids training-serving skew.
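As an illustration of a derived clinical feature, the sketch below computes body mass index and pulse pressure from raw columns. The column names (`weight_kg`, `height_cm`, `systolic_bp`, `diastolic_bp`) and the function name are hypothetical; the features this tool actually derives are not listed here.

```python
from typing import Dict


def derive_features(row: Dict[str, float]) -> Dict[str, float]:
    """Return a copy of the row with derived features added."""
    out = dict(row)
    height_m = row["height_cm"] / 100
    # BMI = weight in kilograms divided by height in meters squared
    out["bmi"] = round(row["weight_kg"] / height_m ** 2, 1)
    # Pulse pressure = systolic minus diastolic blood pressure
    out["pulse_pressure"] = row["systolic_bp"] - row["diastolic_bp"]
    return out


patient = {"weight_kg": 70.0, "height_cm": 175.0, "systolic_bp": 120.0, "diastolic_bp": 80.0}
features = derive_features(patient)
```

Running the same function at training time and at prediction time is one way to avoid the training-serving skew mentioned above, since both paths then compute features identically.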
⭐ Engineered Clinical Features
📊 Feature Importance Preview
📋 Feature Store Preview Table
🤖 ML Operations
📖 ML Educational Module: Simulates clinical ML training. Auto-selects the best algorithm based on task type and data characteristics, or lets you choose one manually. Includes full parameter documentation.
🎯 Model Configuration
Ready to train
📈 Feature Importance
📚 Algorithm Documentation
🔬 Algorithm: Random Forest
🧮 Key Formulas
📥 Input Parameters
Parameter · Type · Description
📤 Output Parameters
Output · Type · Description
⚠️ Error Metrics
🔄 Workflow
📖 Clinical References
💡 Clinical ML Note: Simulated for education. In production, use Python with scikit-learn, validate on prospective clinical data, follow FDA SaMD guidance, and document model lineage.
💬 Clinical AI Assistant
🔑 Anthropic API Key (optional) · Memory-only · Never transmitted
❤️ Heart disease risks?
🧪 Cholesterol significance?
🎯 Target distribution?
🤖 Best ML for clinical?
❓ Missing data strategies?
⚖️ AI Ethics in healthcare?
🔢 ICD-10 codes?
⚠️ Patient risk scores?
💊 Drug interactions?
🧬 Feature engineering?
📊 Sensitivity/Specificity?
💡 Dataset summary?
ClinicalMind AI
👋 Hello! I'm your Clinical Analytics AI Assistant.
I can help with: medical terminology · ICD-10 codes · drug name lookup · patient risk interpretation · ML model selection · statistical methods · data quality · ethical AI in healthcare.
Upload a dataset and ask me anything, or use the quick-chips above for guided clinical insights!