Methodology
Technical foundation of ChiralCall: validation strategy, confidence framework, and production performance.
Overview
ChiralCall predicts the historically favored eutomer (the more pharmacologically active enantiomer) within validated compound classes, using a proprietary validated structural classification method. Predictions reflect the dominant stereochemical preference reported in peer-reviewed pharmacological literature — typically the enantiomer with superior potency at the primary therapeutic target.
The underlying classifier is not machine learning. It derives predictions from fundamental stereochemical analysis of molecular topology — no training on known chirality outcomes, no statistical fitting to activity data. A separate calibration layer (the Calibrated Confidence Score, described below) uses logistic regression to estimate per-compound reliability, but the prediction itself is deterministic and scaffold-based. A full methodology paper is in preparation.
The production system covers 977 compound classes with 82% accuracy verified against 3,655 compounds with published eutomer assignments (Wilson 95% CI: 80.3%–82.8%). Every compound class returns a prediction — each one labeled with a confidence tier so you know exactly how much validation data backs it.
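The interval quoted above can be reproduced from the counts on this page. The sketch below assumes the standard Wilson score interval without continuity correction and uses the 672 errors out of 3,655 compounds reported in the Wrong Predictions section. The function name `wilson_interval` is ours and is not part of ChiralCall.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(correct: int, total: int, confidence: float = 0.95) -> tuple[float, float]:
    """Wilson score interval (no continuity correction) for a binomial proportion."""
    if total <= 0:
        raise ValueError("total must be positive")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    p = correct / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half_width = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half_width, center + half_width


total = 3655
correct = total - 672  # 2,983 correct predictions
low, high = wilson_interval(correct, total)
print(f"accuracy = {correct / total:.1%}, Wilson 95% CI = {low:.1%} to {high:.1%}")
```

Under these assumptions the script reports an accuracy of 81.6% (rounded to 82% in the headline figure) and an interval of 80.3% to 82.8%, matching the numbers above.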
Why Not Machine Learning?
ML models trained on chirality data can achieve comparable accuracy on in-distribution compounds. But for a decision-support tool used in compound prioritization, accuracy alone is not sufficient. What matters is how the tool fails.
ML chirality models
- ✗ Cannot explain individual predictions
- ✗ Sensitive to training set composition
- ✗ Fail silently on out-of-distribution compounds
- ✗ Accuracy degrades unpredictably with new scaffolds
ChiralCall scaffold classifier
- ✓ Every prediction traceable to a validated structural rule
- ✓ Per-class accuracy with Wilson confidence intervals
- ✓ Returns “out of scope” rather than a confident wrong answer
- ✓ Deterministic — same input always produces same output
When ChiralCall does not recognize a compound's scaffold, it says so explicitly. This matters because a wrong prioritization call costs real synthesis time and resources. We believe the right tool for compound prioritization should be auditable, bounded, and honest about its limits.
Validation Protocol
Every compound class in the production table undergoes systematic leave-one-out (LOO) cross-validation: each compound in turn is removed from the validation set and then predicted as a held-out case. This yields a per-class accuracy estimate in which no compound is scored while it is still part of the reference set.
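The LOO loop itself is simple to express. The sketch below is a generic harness, not ChiralCall's code: `loo_accuracy`, the `Compound` alias, and the toy labels are all hypothetical, and `majority_label` is a deliberately naive stand-in predictor used only to make the example runnable. It is not how ChiralCall's classifier works, which the page describes as not fitted to outcome data.

```python
from collections import Counter
from typing import Callable, Sequence

# (compound_id, eutomer_label) pairs; the labels are placeholders, not real assignments.
Compound = tuple[str, str]


def loo_accuracy(
    compounds: Sequence[Compound],
    predict: Callable[[str, Sequence[Compound]], str],
) -> float:
    """Hold out each compound in turn, predict it given the rest, and score the hits."""
    if not compounds:
        raise ValueError("need at least one compound")
    hits = 0
    for i, (compound_id, truth) in enumerate(compounds):
        rest = [c for j, c in enumerate(compounds) if j != i]
        if predict(compound_id, rest) == truth:
            hits += 1
    return hits / len(compounds)


def majority_label(_compound_id: str, rest: Sequence[Compound]) -> str:
    """Toy stand-in predictor: the most common label among the remaining compounds."""
    return Counter(label for _, label in rest).most_common(1)[0][0]


toy_class = [("cmpd-1", "S"), ("cmpd-2", "S"), ("cmpd-3", "S"), ("cmpd-4", "R")]
toy_accuracy = loo_accuracy(toy_class, majority_label)
print(f"LOO accuracy on toy class: {toy_accuracy:.2f}")
```

Any predictor with the same call signature can be dropped in place of `majority_label`; a predictor that ignores `rest` entirely gives the same result under LOO as it would on the full set.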
Our original validation set was prospectively blind: compounds were selected and locked before any predictions were run. No training-set fitting was possible.
A note on scaffold-split validation
Experienced computational chemists will ask: “Does LOO within a compound class leak signal from close analogs?” This is the right question for QSAR or ML models, where learned features from structurally similar training compounds can inflate held-out accuracy.
ChiralCall's classifier does not learn from activity data. Predictions are derived from validated structural classification of molecular topology — the same descriptor computation runs identically whether the compound has zero or a hundred validated analogs in the database. There is no feature vector fitted to training outcomes, so scaffold similarity between the held-out compound and remaining compounds does not create information leakage.
The strongest test of this claim is a blinded retrospective on your own internal compounds, which is exactly what the CRO pilot is designed for. If ChiralCall's accuracy holds on your unpublished analogs, the structural classification approach has been confirmed on your own chemical matter.
In short: LOO is appropriate here because the classifier is deterministic and does not train on activity outcomes. For organizations seeking additional assurance, we recommend scaffold-class and project-level holdouts using your own compounds.
Confidence Tiers
ChiralCall returns a prediction for every compound class in its library, so any compound that matches a class always gets an answer. The confidence tier tells you how much validation data supports that prediction, so you can decide how to act on it.
Tiers are computed automatically from Wilson 95% confidence intervals on per-class accuracy data. As we validate more compounds in each class, tiers are promoted accordingly.
Tier 1 — Validated production
Highest confidence. Use these predictions to guide synthesis planning and prioritize lead compounds.
- N ≥ 10: minimum 10 validated compounds in the class
- ≥ 90%: leave-one-out cross-validation accuracy of at least 90%
- CI > 80%: Wilson 95% CI lower bound exceeds 80%
Tier 2 — High accuracy, building sample size
High accuracy on available data, but the sample size is still building. Good for hypothesis generation — confirm experimentally before committing resources.
- N ≥ 6: minimum 6 validated compounds in the class
- ≥ 90%: leave-one-out cross-validation accuracy of at least 90%
- CI > 54%: Wilson 95% CI lower bound exceeds 54%
Tier 3 — Below validation threshold
You still get a prediction, but it comes with a specific disclaimer explaining why this class hasn't been promoted yet — whether that's limited sample size (N < 6), lower accuracy on available compounds, or scaffold coverage without enough validated examples. Tier 3 classes are actively accumulating data and will be promoted to Tier 2 or Tier 1 as validation compounds are added.
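The tier rules above can be written as a small decision function. This is a sketch under stated assumptions: it takes LOO accuracy to be correct/total, uses the Wilson score lower bound without continuity correction, and the names `wilson_lower` and `assign_tier` are ours rather than ChiralCall's.

```python
from math import sqrt

Z_95 = 1.959964  # two-sided 95% normal quantile


def wilson_lower(correct: int, total: int, z: float = Z_95) -> float:
    """Lower bound of the Wilson score interval for correct/total."""
    p = correct / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half_width = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half_width


def assign_tier(correct: int, total: int) -> int:
    """Return 1, 2, or 3 using the N, accuracy, and CI thresholds listed above."""
    if total < 6:
        return 3  # below the minimum sample size for Tier 2
    accuracy = correct / total
    lower = wilson_lower(correct, total)
    if total >= 10 and accuracy >= 0.90 and lower > 0.80:
        return 1
    if accuracy >= 0.90 and lower > 0.54:
        return 2
    return 3


for correct, total in [(16, 16), (10, 10), (6, 6), (5, 5), (8, 10)]:
    print(f"{correct}/{total}: Tier {assign_tier(correct, total)}")
```

In this sketch the Wilson lower bound is the binding constraint for Tier 1 in small classes: 10 of 10 correct gives a lower bound near 72% and lands in Tier 2, and a perfect record first clears 80% at 16 compounds. If the production system uses a different interval variant, the cut-over points would differ.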
Calibrated Confidence Score (CCS)
Every prediction includes a Calibrated Confidence Score (CCS) — a numeric probability of correctness from 0 to 100, computed per compound. CCS is not the prediction itself — it is a separate calibration layer that estimates how reliable each individual prediction is likely to be, based on structural features and compound-class track record.
To be explicit about the two-component architecture: the classifier (validated structural classification, deterministic, no ML) produces the eutomer prediction. The confidence scorer (logistic regression, trained on 898 validated compounds) estimates that prediction's reliability. These are independent systems — the classifier would produce identical predictions with or without CCS.
How CCS is computed
CCS is produced by a logistic regression model trained on 898 compounds with known eutomer assignments. The model combines two categories of input:
Compound-class accuracy — the historical leave-one-out cross-validation accuracy for the matched compound class. Higher class accuracy increases the CCS score.
Structural complexity features — five proprietary features computed from the SMILES string that characterize the stereochemical environment of each compound. These capture the number of stereocenters, the diversity and distribution of neighboring atom types, and the overall complexity of the chiral environment. No conformer generation is required.
The logistic regression outputs a sigmoid probability, scaled to 0–100 and mapped to four confidence tiers:
- Very High
- High
- Medium
- Low
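The scoring step described above (a logistic regression output passed through a sigmoid and scaled to 0–100) can be sketched as follows. The weights, bias, and feature values are invented for illustration; the real coefficients and the five structural features are proprietary, and the cutoffs that map a score to a tier label are not stated here, so the sketch stops at the numeric score.

```python
from math import exp
from typing import Sequence


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + exp(-x))
    e = exp(x)
    return e / (1.0 + e)


def confidence_score(
    class_accuracy: float,
    structural_features: Sequence[float],
    weights: Sequence[float],
    bias: float,
) -> float:
    """Logistic regression probability scaled to the 0-100 range."""
    inputs = [class_accuracy, *structural_features]
    if len(inputs) != len(weights):
        raise ValueError("one weight is required per input")
    logit = bias + sum(w * x for w, x in zip(weights, inputs))
    return 100.0 * sigmoid(logit)


# Hypothetical coefficients: one for class accuracy, five for structural features.
weights = [4.0, -0.3, -0.2, 0.1, -0.1, -0.4]
bias = -1.0
features = [2.0, 1.0, 0.5, 3.0, 1.0]  # placeholder feature values

score = confidence_score(0.95, features, weights, bias)
print(f"confidence_score = {score:.1f}")
```

With a positive weight on class accuracy, a higher historical accuracy for the matched class raises the score, which is the behavior described for the compound-class input.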
CCS is returned in every API response as confidence_score (numeric, 0–100) and confidence_tier (label). The score is calibrated: compounds scoring ≥90 are correct 97% of the time in our validation set.
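A calibration claim of this kind can be checked on your own labeled compounds by measuring accuracy among predictions at or above a score threshold. A minimal sketch, assuming you have collected pairs of `confidence_score` and a correct/incorrect flag; the helper name and the toy records are illustrative and are not ChiralCall data.

```python
from typing import Iterable, Optional


def accuracy_at_or_above(
    records: Iterable[tuple[float, bool]], threshold: float
) -> tuple[int, Optional[float]]:
    """Return (count, accuracy) for predictions whose score is >= threshold."""
    selected = [was_correct for score, was_correct in records if score >= threshold]
    if not selected:
        return 0, None  # nothing scored that high, so accuracy is undefined
    return len(selected), sum(selected) / len(selected)


# Toy records: (confidence_score, prediction_was_correct)
toy_records = [(95.0, True), (92.0, True), (91.0, False), (80.0, True), (60.0, False)]

n_high, acc_high = accuracy_at_or_above(toy_records, 90.0)
print(f"{n_high} predictions scored >= 90, accuracy {acc_high:.2f}")
```

On a real retrospective set, the same function evaluated at several thresholds shows whether observed accuracy rises with the score as a calibrated scorer should.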
Scaffold Classifier Limitations
ChiralCall uses a scaffold classifier to route each input compound to its compound class. This classifier works by matching substructural patterns (SMARTS) in the input molecule against a library of scaffold definitions. Understanding its limitations helps you interpret predictions correctly.
What the classifier does well
The scaffold classifier uses RDKit-based SMARTS substructure matching with specificity scoring. When multiple compound classes match a given molecule, the classifier selects the most specific match — the class whose defining substructures contain the most atoms and match the most patterns. This means highly specific scaffolds (like morphinans or dihydropyridines) are reliably identified even when they also contain simpler substructures shared by other classes.
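The selection rule can be sketched without the chemistry toolkit by assuming the substructure matching has already been run (for example with RDKit's `GetSubstructMatches`) and summarized per class. Everything here is hypothetical: the `ClassMatch` record, the `route` function, the atom counts, and in particular the tie-breaking order (matched atoms first, then matched patterns), since the page does not say how the two criteria are combined.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ClassMatch:
    class_name: str
    matched_atoms: int     # atoms covered by the class's matching substructures
    matched_patterns: int  # how many of the class's SMARTS patterns hit


def route(matches: Sequence[ClassMatch]) -> Optional[str]:
    """Pick the most specific matching class, or None when out of scope."""
    hits = [m for m in matches if m.matched_patterns > 0]
    if not hits:
        return None  # out of scope: no scaffold pattern matched
    best = max(hits, key=lambda m: (m.matched_atoms, m.matched_patterns))
    return best.class_name


# Illustrative counts for a molecule that matches two overlapping classes.
candidates = [
    ClassMatch("beta-amino-alcohol", matched_atoms=4, matched_patterns=1),
    ClassMatch("aminoquinoline", matched_atoms=11, matched_patterns=2),
]
print(route(candidates))
print(route([]))
```

Returning `None` instead of falling back to the weakest match mirrors the out-of-scope behavior described in this section.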
Known limitations
Scope boundaries. The classifier only recognizes compounds that contain at least one scaffold pattern from its library. Compounds outside all defined classes return an “out of scope” result rather than a forced, unreliable prediction. This is by design — we prefer honest scope limits over false confidence.
Cross-class overlap. Some compound classes share substructural features. A molecule with both an aminoquinoline core and a beta-amino-alcohol side chain could plausibly belong to either class. Specificity scoring resolves most such ambiguities by preferring the more structurally specific match, but edge cases exist. The confidence tier and CCS score help flag these borderline classifications.
Novel scaffolds. Entirely new chemical scaffolds not represented in the current library will return out-of-scope. We continuously expand scaffold coverage, and you can request a new compound class for scaffolds relevant to your research.
Bottom line: The scaffold classifier is a routing mechanism, not the prediction itself. A correct scaffold match connects your compound to the right chirality convention; the prediction accuracy within each class is what the confidence tiers and CCS scores measure. An out-of-scope result is informative — it means no validated chirality convention exists for that scaffold in our database, and a prediction would be unreliable.
Cost of Testing Both
A single chiral resolution typically costs $5,000–$15,000 in synthesis, separation, and assay time. For organizations processing dozens to hundreds of chiral compounds per year, a first-pass eutomer call with validated confidence boundaries can redirect resources away from the less active enantiomer — before committing to expensive resolution and dual-track testing.
The conventional approach, synthesizing and testing both enantiomers for every chiral compound, treats every center of chirality as equally uncertain. But for compound classes where one enantiomer is consistently favored (ChiralCall achieves 82% overall accuracy across 3,655 validated compounds), this double-testing represents redundant cost. A correct first call means the active enantiomer is prioritized from day one, and the inactive enantiomer is deprioritized rather than processed in parallel.
This is particularly relevant for CROs managing large compound libraries in chiral screening campaigns, where the cumulative cost of unnecessary resolutions scales directly with throughput. A cost calculator on our homepage lets you estimate savings based on your quarterly volume and per-compound resolution cost.
Production Performance
3,655 compounds with published eutomer assignments
82% accuracy (Wilson 95% CI: 80.3%–82.8%)
977 compound classes across Tier 1, Tier 2, and Tier 3
These metrics represent validation across 977 compound classes spanning pharmaceuticals and agrochemicals. Each compound class maintains independent accuracy and confidence metrics. Verified against 100% of compounds with published eutomer assignments. Download the technical validation supplement (PDF) or the raw validation dataset (CSV) with every compound, prediction, and outcome.
Wrong Predictions — Fully Disclosed
Across 3,655 validated compounds with published eutomer assignments, ChiralCall returned 672 incorrect predictions (18.4% error rate). We disclose every one of them.
Transparency about failure modes is essential to scientific credibility. For each wrong prediction we publish the input SMILES, the predicted enantiomer, the actual enantiomer, and the confidence score at the time of prediction.
We categorize wrong predictions by compound structural features — for example, conformationally flexible macrocycles, fused polycyclic systems with adjacent stereocenters, heavily substituted dihydropyridines. We do not publish method-level failure analysis. Full data is available in the upcoming Wrong Predictions Browser.
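The disclosed fields map naturally onto a small record type, and the structural categorization onto a grouped count. The sketch below is illustrative only: the field names, the `category` field, and the placeholder records are assumptions, not the schema of the published dataset or the Wrong Predictions Browser.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class WrongPrediction:
    smiles: str              # input SMILES as submitted
    predicted: str           # enantiomer ChiralCall predicted
    actual: str              # enantiomer reported in the literature
    confidence_score: float  # CCS at the time of prediction
    category: str            # structural feature category (hypothetical field)


def count_by_category(records: Iterable[WrongPrediction]) -> Counter:
    """Count wrong predictions per structural category."""
    return Counter(r.category for r in records)


# Placeholder records; the SMILES strings and scores are not real data.
toy_errors = [
    WrongPrediction("<smiles-1>", "S", "R", 62.0, "flexible macrocycle"),
    WrongPrediction("<smiles-2>", "R", "S", 71.0, "fused polycyclic"),
    WrongPrediction("<smiles-3>", "S", "R", 55.0, "fused polycyclic"),
]

counts = count_by_category(toy_errors)
for category, n in counts.most_common():
    print(f"{category}: {n}")
```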
Cite This Work
If you use ChiralCall in your research or development, please cite:
A full methodology paper is in preparation. Academic users are welcome to cite ChiralCall in the meantime using the format above.