The Experiment Maze

Watch how IdeaMaze explores, learns, and converges. Every node is an idea. Green paths succeeded. Red paths were dead ends. The golden path leads to the best result.

[Interactive maze visualization: 17 experiment nodes with live counters for experiments run, best metric, and improvement. Legend: green = success (kept); red = dead end / failure; gold = golden path to best. Node categories: target transforms, feature engineering, ensemble, preprocessing, neural net.]

Key Discoveries Along the Way

Each experiment taught the system something. Here are the turning points.

🏆 Log Transform: The Foundation (37% improvement)

The very first successful experiment. Applying log1p to the skewed target variable reduced error by 37%. Every subsequent improvement built on this foundation.

Lesson: Always start with target transforms for heavy-tailed distributions.
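As a minimal sketch of the transform itself (the data here is a hypothetical log-normal target, not the experiment's actual dataset): `log1p` compresses the heavy right tail, and `expm1` inverts it exactly, so a model can be trained in log space and its predictions mapped back.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical heavy-tailed target (log-normal), for illustration only
y = rng.lognormal(mean=3.0, sigma=1.2, size=1000)

y_log = np.log1p(y)        # compresses the long right tail
y_back = np.expm1(y_log)   # exact inverse of log1p

# Round-trip is lossless up to floating point
assert np.allclose(y, y_back)

def skew(a):
    """Sample skewness: third standardized moment."""
    a = np.asarray(a, dtype=float)
    return np.mean(((a - a.mean()) / a.std()) ** 3)

# Skewness drops sharply after the transform
print(f"skew before: {skew(y):.2f}, after: {skew(y_log):.2f}")
```

Training against `y_log` and applying `expm1` to predictions is the standard recipe; the gain comes from the loss no longer being dominated by the tail.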

🚫 The Winsorization Trap (26x metric gaming)

An agent discovered that clipping extreme values made the metric look 26x better. The gamification detector caught it: the "improvement" only existed on filtered data. On real-world data, it was worthless.

Caught: Filtered/unfiltered ratio of 9.14x triggered the gamification flag.
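The detector's internals aren't shown, but the core check can be sketched as comparing the same metric on the full data versus the filtered subset. All names, the synthetic data, and the threshold below are hypothetical; the idea is only that a ratio far from 1 means the "improvement" lives in the filter, not the model.

```python
import numpy as np

def gaming_ratio(metric, y_true, y_pred, mask):
    """Ratio of the metric on full data vs. the filtered subset.

    A ratio >> 1 for an error metric means the filter (e.g. clipping
    extremes) is hiding most of the error.
    """
    return metric(y_true, y_pred) / metric(y_true[mask], y_pred[mask])

def rmse(y, p):
    return float(np.sqrt(np.mean((y - p) ** 2)))

rng = np.random.default_rng(1)
y = rng.normal(0, 1, 500)
y[:10] += 50                        # a few extreme values
pred = y + rng.normal(0, 0.1, 500)
pred[:10] = 0.0                     # the model fails exactly on the extremes

mask = np.abs(y) < 5                # "winsorized" evaluation mask
ratio = gaming_ratio(rmse, y, pred, mask)

GAMING_THRESHOLD = 2.0              # hypothetical cutoff
print(f"ratio: {ratio:.1f}x -> flagged: {ratio > GAMING_THRESHOLD}")
```

Here the filtered score looks excellent while the unfiltered one is poor, so the ratio blows past the threshold and the experiment is flagged rather than kept.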

🧠 Ensemble Diversity > Ensemble Size

A 6-model ensemble with diverse loss functions (MSE + MAE + Huber + Quantile) outperformed a 10-model ensemble using the same loss. The system learned this pattern across multiple experiments.

Pattern: Diminishing returns after 3-4 models with same loss function.
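Why diversity beats size can be illustrated with a toy simulation (all numbers hypothetical): models trained with the same loss share a large common error component, so averaging more of them barely helps, while models with mostly independent errors average out much faster.

```python
import numpy as np

rng = np.random.default_rng(2)
y = rng.normal(0, 1, 2000)

def ensemble_pred(n_models, shared_frac):
    """Average n simulated models whose errors share a common component.

    shared_frac near 1.0 mimics same-loss models (correlated errors);
    near 0.0 mimics diverse losses (mostly independent errors).
    """
    shared = rng.normal(0, 1, y.size)
    preds = []
    for _ in range(n_models):
        own = rng.normal(0, 1, y.size)
        err = np.sqrt(shared_frac) * shared + np.sqrt(1 - shared_frac) * own
        preds.append(y + 0.5 * err)
    return np.mean(preds, axis=0)

def rmse(p):
    return float(np.sqrt(np.mean((p - y) ** 2)))

same_loss_10 = rmse(ensemble_pred(10, shared_frac=0.9))  # big, correlated
diverse_4 = rmse(ensemble_pred(4, shared_frac=0.1))      # small, diverse

print(f"10 correlated models: {same_loss_10:.3f}, 4 diverse models: {diverse_4:.3f}")
```

The shared component sets a floor that no amount of averaging removes, which is exactly the diminishing-returns pattern the system observed after 3-4 same-loss models.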

⚡ The Golden Path: Compound Innovation

The best result combined five discoveries: log transform, target encoding, cross-source features, diverse ensemble, and entity embeddings. No single trick; the value was in stacking validated improvements.

Final: 68.6% improvement from baseline through systematic exploration.
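The arithmetic of stacking can be sketched by compounding relative error reductions. Only the 37% log-transform figure and the 68.6% total come from the maze above; the per-step values for the other four discoveries are hypothetical, chosen so the compound total matches the reported result.

```python
# Relative error reduction per step. Only "log transform" (37%) and the
# 68.6% total are from the maze; the other values are illustrative.
steps = {
    "log transform": 0.37,
    "target encoding": 0.16,
    "cross-source features": 0.16,
    "diverse ensemble": 0.16,
    "entity embeddings": 0.16,
}

remaining = 1.0
for name, gain in steps.items():
    remaining *= 1.0 - gain  # improvements multiply, not add
    print(f"{name:<22} -> error at {remaining:.1%} of baseline")

print(f"total improvement: {1.0 - remaining:.1%}")
```

Because reductions compound multiplicatively, five moderate validated wins stack to a total far beyond any single one, which is the point of the golden path.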

Visualize Your Own Experiments

Export your maze data with `maze.py sync` and upload the resulting `maze_data.json` to explore your own research maze.