Machine Learning System · Solo Project · Python / Scikit-learn
A supervised machine learning system that predicts box office revenue and identifies optimal runtime ranges for maximizing audience ratings across film genres.
The system analyzes a dataset of ~4,800 film records and provides two capabilities: revenue prediction from budget, genre, and runtime, and a runtime optimization analysis that identifies, for each genre, the runtime ranges most strongly associated with high audience ratings. Feature engineering and model selection were tuned to balance accuracy with interpretability.
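As a rough illustration of the revenue-prediction side, the sketch below fits a Random Forest regressor on budget, runtime, and one-hot encoded genre. It is a minimal sketch under stated assumptions, not the project's actual pipeline: the column names (`budget`, `runtime`, `genre`, `revenue`), the synthetic data, and the hyperparameters are all hypothetical stand-ins.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in for the film dataset; every value here is made up.
rng = np.random.default_rng(0)
n = 500
films = pd.DataFrame({
    "budget": rng.uniform(1e6, 2e8, n),
    "runtime": rng.uniform(80, 180, n),
    "genre": rng.choice(["Action", "Drama", "Comedy"], n),
})
films["revenue"] = films["budget"] * rng.uniform(0.5, 3.0, n)

X = films[["budget", "runtime", "genre"]]
y = films["revenue"]

# One-hot encode the categorical genre column; numeric columns pass through.
preprocess = ColumnTransformer(
    transformers=[("genre", OneHotEncoder(handle_unknown="ignore"), ["genre"])],
    remainder="passthrough",
)

# Assumed model choice and settings, for illustration only.
model = Pipeline([
    ("preprocess", preprocess),
    ("forest", RandomForestRegressor(n_estimators=100, random_state=0)),
])

# Hold out 20% of the rows to estimate out-of-sample performance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
model.fit(X_train, y_train)

predictions = model.predict(X_test)
r2 = model.score(X_test, y_test)  # coefficient of determination on held-out rows
```

Wrapping the encoder and the regressor in one `Pipeline` keeps the genre encoding fitted only on the training split, which avoids leaking test rows into preprocessing.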
Python · Pandas · NumPy · Scikit-learn · Random Forest · Gradient Boosting · Jupyter Notebook · Git
| Genre | Correlation (r) | Strength |
|---|---|---|
| Action | 0.468 | Strong |
| Thriller | 0.459 | Strong |
| Adventure | not given | Moderate |
| Drama | not given | Moderate |
| Horror | not given | Moderate |
| Comedy | not given | Weak |
| Animation | not given | Weak |
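
A per-genre analysis of this kind can be sketched with pandas, assuming the table reports the Pearson correlation between runtime and audience rating (the source does not say so explicitly). The column names (`genre`, `runtime`, `rating`), the 20-minute bin width, the bin edges, and the toy data are hypothetical; both function names are illustrative and not taken from the project.

```python
import numpy as np
import pandas as pd


def runtime_rating_correlation(films: pd.DataFrame) -> pd.Series:
    """Pearson r between runtime and rating, computed separately per genre."""
    return pd.Series({
        genre: group["runtime"].corr(group["rating"])
        for genre, group in films.groupby("genre")
    })


def best_runtime_range(films: pd.DataFrame, bin_width: int = 20) -> pd.Series:
    """For each genre, the runtime bin with the highest mean rating."""
    # Assumed bin edges: 60 to 200 minutes in steps of bin_width.
    edges = np.arange(60, 201, bin_width)
    binned = films.assign(runtime_range=pd.cut(films["runtime"], bins=edges))
    mean_rating = binned.groupby(
        ["genre", "runtime_range"], observed=True
    )["rating"].mean()
    # idxmax yields (genre, interval) tuples; keep only the interval.
    best = mean_rating.groupby(level="genre").idxmax()
    return best.map(lambda key: key[1])


# Toy data, made up for illustration only.
films = pd.DataFrame({
    "genre": ["Action"] * 4 + ["Comedy"] * 4,
    "runtime": [90, 110, 130, 150, 85, 95, 105, 115],
    "rating": [5.0, 6.0, 7.0, 8.0, 7.0, 6.5, 6.8, 6.6],
})

correlations = runtime_rating_correlation(films)
best_ranges = best_runtime_range(films)
```

Here `correlations` is indexed by genre and `best_ranges` maps each genre to a `pandas.Interval`. A correlation summarizes a linear trend only, so the binned view is a useful complement when ratings peak at a mid-range runtime.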
Copyright © Keith Tran.