.. CricCatapult documentation master file

CricCatapult
============

**The Modern Cricket Analytics Platform**

Production-grade machine learning for cricket data. Train models, make predictions, analyze matches - all from a single Python library.

Trusted by data scientists, AI engineers, and cricket analysts worldwide.

----

Why CricCatapult?
=================

**Pre-Trained Models**
    ML models work instantly. No training required, no API keys, no costs.

**Lightning Fast**
    Sub-10ms predictions. Works offline. Production-ready.

**AI-Native**
    Built for AI agents, automation, and modern workflows. JSON-first API.

**Battle-Tested**
    Presented at Carnegie Mellon Sports Analytics Conference. Used in production.

----

Quick Start
===========

Install
-------

.. code-block:: bash

    pip install criccatapult

That's it. Models are included.

Make Predictions
----------------

.. code-block:: python

    from CricCatapult.ml import get_predictor

    predictor = get_predictor()

    # Predict match outcome
    result = predictor.predict_match_outcome("India", "Australia")
    print(f"{result['predicted_winner']} wins with {result['confidence']}% confidence")

    # Predict player performance
    from CricCatapult.ml import PlayerStatsSynthesizer

    stats = PlayerStatsSynthesizer.get_player_stats("Virat Kohli")
    performance = predictor.predict_player_performance(stats)
    print(f"Predicted: {performance['predicted_runs']} runs")

Download Data
-------------

.. code-block:: python

    from CricCatapult import Cricsheet

    cs = Cricsheet()
    cs.IPL_csv()  # Download all IPL data

.. code-block:: bash

    # Or use the CLI
    criccatapult-cli cricsheet --type ipl --format json

----

For AI Agents
=============

CricCatapult is designed for AI agents and automation workflows.

**Structured Output**

.. code-block:: bash

    criccatapult-cli --format json predict-match \
        --team1 "India" --team2 "Australia"

.. code-block:: json

    {
      "predicted_winner": "India",
      "confidence": 55.4,
      "win_probability": {
        "India": 55.4,
        "Australia": 44.6
      }
    }

**Natural Language**

.. code-block:: bash

    criccatapult-cli ask "Who won the last IPL?"
    criccatapult-cli ask "Predict India vs Pakistan"
    criccatapult-cli ask "Show Virat Kohli stats"

**Production Ready**

.. code-block:: python

    from flask import Flask, request, jsonify
    from CricCatapult.ml import get_predictor

    app = Flask(__name__)
    predictor = get_predictor()

    @app.route('/predict', methods=['POST'])
    def predict():
        data = request.json
        result = predictor.predict_match_outcome(
            data['team1'],
            data['team2']
        )
        return jsonify(result)

----

Machine Learning
================

Player Performance Prediction
------------------------------

**XGBoost Regressor**
    Predicts runs scored in next innings with confidence intervals.

**Features**
    Historical averages, strike rates, recent form, consistency metrics, momentum indicators.

**Performance**
    Mean Absolute Error: 15 runs. Inference: <10ms.

.. code-block:: python

    from CricCatapult.ml import get_predictor, PlayerStatsSynthesizer

    predictor = get_predictor()
    stats = PlayerStatsSynthesizer.get_player_stats("Rohit Sharma")
    prediction = predictor.predict_player_performance(stats)

    # Returns: predicted_runs, confidence, confidence_interval

Match Outcome Prediction
-------------------------

**Random Forest Classifier**
    Predicts match winner with win probability percentages.

**Features**
    Team strength, historical win rates, toss decisions, venue factors.

**Performance**
    Accuracy: 58.5%. Inference: <10ms.

.. code-block:: python

    predictor = get_predictor()
    result = predictor.predict_match_outcome("England", "Pakistan")

    # Returns: predicted_winner, confidence, win_probability

Feature Engineering
-------------------

Extract features from raw cricket data for custom models.

.. code-block:: python

    from CricCatapult.ml import CricketFeatureEngineer, load_cricsheet_data

    df = load_cricsheet_data('ipl_matches.csv')
    engineer = CricketFeatureEngineer()

    # Extract batting features
    batting = engineer.extract_batting_features(df, "Virat Kohli")
    # career_avg, career_sr, boundary_pct, powerplay_sr, death_sr, etc.

    # Extract bowling features
    bowling = engineer.extract_bowling_features(df, "Jasprit Bumrah")
    # economy, wickets, dot_ball_pct, death_economy, etc.

Custom Training
---------------

Fine-tune models on your data using Google Colab (free GPUs).

**Step 1**: Upload notebook from ``notebooks_training/``

**Step 2**: Upload your Cricsheet CSV data

**Step 3**: Run all cells (5-10 minutes)

**Step 4**: Download and deploy

.. code-block:: bash

    # Replace pre-trained model
    mv your_model.pkl CricCatapult/models/player_performance_model.pkl

    # Predictions now use your model
    criccatapult-cli predict-player --player "Your Player"

----

Command Line Interface
======================

ML Predictions
--------------

.. code-block:: bash

    # Player performance
    criccatapult-cli predict-player --player "Virat Kohli"

    # Match outcome
    criccatapult-cli predict-match --team1 "India" --team2 "Australia"

    # Structured JSON output
    criccatapult-cli --format json predict-player --player "Rohit Sharma"

Data Downloads
--------------

.. code-block:: bash

    # Indian Premier League
    criccatapult-cli cricsheet --type ipl

    # All T20 matches
    criccatapult-cli cricsheet --type t20 --gender male

    # Recent matches
    criccatapult-cli cricsheet --type recent --days 7

Natural Language
----------------

.. code-block:: bash

    criccatapult-cli ask "Download IPL data"
    criccatapult-cli ask "Predict India vs Australia"
    criccatapult-cli ask "Show Kohli's stats"

Analytics
---------

.. code-block:: bash

    # Usage dashboard
    criccatapult-cli dashboard --days 30

----

Terminal Interface
==================

Launch the interactive terminal UI:

.. code-block:: bash

    criccatapult

Navigate with arrow keys. Press Q to quit.

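
For unattended use (scripts or AI agents) the interactive UI is not a good fit, and the JSON mode of ``criccatapult-cli`` can be driven from Python instead. The following is a minimal sketch, not part of the library: ``MatchPrediction``, ``parse_match_prediction`` and ``predict_match_cli`` are hypothetical helper names, and it assumes the CLI prints exactly one JSON object with the ``predicted_winner``, ``confidence`` and ``win_probability`` fields shown in the structured-output example of this documentation.

.. code-block:: python

    import json
    import subprocess
    from dataclasses import dataclass


    @dataclass
    class MatchPrediction:
        """Typed view of the predict-match JSON payload (hypothetical helper)."""
        predicted_winner: str
        confidence: float
        win_probability: dict


    def parse_match_prediction(raw: str) -> MatchPrediction:
        """Parse JSON text into a MatchPrediction and sanity-check it."""
        payload = json.loads(raw)
        prediction = MatchPrediction(
            predicted_winner=payload["predicted_winner"],
            confidence=float(payload["confidence"]),
            win_probability={
                team: float(pct)
                for team, pct in payload["win_probability"].items()
            },
        )
        # Assumption: probabilities are percentages that sum to roughly 100.
        total = sum(prediction.win_probability.values())
        if abs(total - 100.0) > 0.5:
            raise ValueError(f"win probabilities sum to {total}, expected ~100")
        return prediction


    def predict_match_cli(team1: str, team2: str) -> MatchPrediction:
        """Run the CLI (must be installed and on PATH) and parse its stdout."""
        completed = subprocess.run(
            ["criccatapult-cli", "--format", "json", "predict-match",
             "--team1", team1, "--team2", team2],
            capture_output=True, text=True, check=True,
        )
        return parse_match_prediction(completed.stdout)


    # Parse the sample payload from this documentation (no CLI needed).
    sample = (
        '{"predicted_winner": "India", "confidence": 55.4, '
        '"win_probability": {"India": 55.4, "Australia": 44.6}}'
    )
    prediction = parse_match_prediction(sample)
    print(prediction.predicted_winner, prediction.confidence)

Keeping the parsing step separate from the subprocess call means the parser can be exercised against a saved payload on machines where the CLI is not installed.
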

----

Python API
==========

Cricsheet Data
--------------

Download historical match data from Cricsheet.org.

.. code-block:: python

    from CricCatapult import Cricsheet

    cs = Cricsheet()

    # League data
    cs.IPL_csv()                 # Indian Premier League
    cs.bigbashleague_csv()       # Big Bash League
    cs.pakistanleague_csv()      # Pakistan Super League
    cs.caribbeanleague_csv()     # Caribbean Premier League

    # Format data
    cs.t20_csv(gender="male")    # T20 matches
    cs.odi_csv(gender="female")  # ODI matches
    cs.test_matches_csv("both")  # Test matches

    # Recent data
    cs.recent_csv(gender="male", days=7)  # Last 7 days

Player Analytics
----------------

Comprehensive player statistics and career analysis.

.. code-block:: python

    from CricCatapult import Player

    player = Player("Virat Kohli")

    # Career overview
    career = player.get_career_df()

    # Personal information
    name = player.get_personal_info("full_name")
    style = player.get_personal_info("batting_style")

    # Teams represented
    teams = player.get_teams()

    # Format-specific stats
    odi_batting = player.get_format_df(
        format_num=2,
        view='match',
        action='batting'
    )

Match Analysis
--------------

Detailed match scorecards and visualizations.

.. code-block:: python

    from CricCatapult import Series
    import matplotlib.pyplot as plt

    # Initialize with match IDs from ESPNCricinfo
    series = Series(series_id=1298423, match_id=1298436)

    # Scorecards
    first_innings = series.batting_df(bat_first=True)
    bowling_figures = series.bowling_df(bowl_first=True)

    # MVP statistics
    mvp = series.mvp()

    # Visualizations
    series.manhattan()  # Runs per over
    series.worm()       # Cumulative runs
    plt.show()

Live Scores
-----------

Real-time cricket scores worldwide.

.. code-block:: python

    from CricCatapult import Scoreboard

    scoreboard = Scoreboard()

    # All live matches
    live = scoreboard.scores()

    # As DataFrame
    df = scoreboard.scores_df()
    ongoing = df[df['Status'] == 'Live']

Records
-------

Cricket records by team, tournament, year, ground.

.. code-block:: python

    from CricCatapult import Records

    records = Records()

    # IPL records
    ipl_runs = records.get_ipl_records(
        format='batting',
        record='most_runs_career'
    )

    # Big Bash League
    bbl_wickets = records.get_bbl_records(
        format='bowling',
        record='most_wickets_career'
    )

Venue Information
-----------------

Cricket ground locations and interactive maps.

.. code-block:: python

    from CricCatapult import Location

    location = Location(match_id="1329821")

    # Venue name
    venue = location.get_location()

    # Interactive map
    map_obj = location.get_map()
    map_obj.save("ground_map.html")

----

Deployment
==========

REST API
--------

Deploy as a production API service.

.. code-block:: python

    from flask import Flask, request, jsonify
    from CricCatapult.ml import get_predictor

    app = Flask(__name__)
    predictor = get_predictor()

    @app.route('/api/v1/predict/match', methods=['POST'])
    def predict_match():
        data = request.json
        result = predictor.predict_match_outcome(
            data['team1'],
            data['team2']
        )
        return jsonify(result)

    @app.route('/api/v1/predict/player', methods=['POST'])
    def predict_player():
        data = request.json
        result = predictor.predict_player_performance(data['stats'])
        return jsonify(result)

    if __name__ == '__main__':
        app.run(host='0.0.0.0', port=8000)

Batch Processing
----------------

Process multiple predictions efficiently.

.. code-block:: python

    from CricCatapult.ml import get_predictor, PlayerStatsSynthesizer

    predictor = get_predictor()
    players = ["Virat Kohli", "Rohit Sharma", "KL Rahul", "Rishabh Pant"]

    results = []
    for player in players:
        stats = PlayerStatsSynthesizer.get_player_stats(player)
        prediction = predictor.predict_player_performance(stats)
        results.append({
            'player': player,
            'predicted_runs': prediction['predicted_runs'],
            'confidence': prediction['confidence']
        })

    print(results)

Background Jobs
---------------

Integrate with task queues for async processing.

.. code-block:: python

    from celery import Celery
    from CricCatapult.ml import get_predictor

    # A result backend is required for task.get() to return a value;
    # with only a broker configured, Celery cannot store task results.
    app = Celery(
        'tasks',
        broker='redis://localhost:6379',
        backend='redis://localhost:6379'
    )
    predictor = get_predictor()

    @app.task
    def predict_match_async(team1, team2):
        result = predictor.predict_match_outcome(team1, team2)
        return result

    # Queue prediction
    task = predict_match_async.delay("India", "Australia")
    result = task.get()

----

Enterprise Features
===================

**Zero Dependencies** - No external APIs. No rate limits. No downtime.

**Data Privacy** - All processing happens locally. Your data never leaves your infrastructure.

**Offline Capable** - Models work without internet after installation.

**Version Control** - Models are files. Version them with Git. Roll back anytime.

**Audit Trail** - Built-in usage analytics. Track every prediction.

**Custom Training** - Fine-tune on your proprietary data. Keep your competitive edge.

----

Support
=======

**Documentation**
    You're reading it. Comprehensive guides for every feature.

**GitHub**
    https://github.com/aadrijupadya/CricCatapult

**PyPI**
    https://pypi.org/project/criccatapult/

**Issues & Features**
    Open an issue on GitHub. We respond within 24 hours.

----

FAQ
===

**Do I need API keys?**
    No. Everything works locally. No external services required.

**What's the cost?**
    Free. Open source. No hidden fees.

**Can I use this in production?**
    Yes. Built for production. Used by multiple organizations.

**How accurate are predictions?**
    Player model: MAE 15 runs. Match model: 58.5% accuracy. Improve with custom training.

**Does it work offline?**
    Yes. After installation, all predictions work offline.

**Can AI agents use this?**
    Yes. Designed for AI agents. JSON output, CLI interface, deterministic results.

**How big are the models?**
    2.1MB total. Negligible impact on package size.

**Can I train custom models?**
    Yes. Google Colab notebooks included. Train on your data.

**What about data freshness?**
    Cricsheet.org updates regularly. Download anytime with one command.

**Is this production-ready?**
    Yes. Battle-tested. Proper error handling. Comprehensive tests.

----

Architecture
============

**ML Models** - XGBoost for regression. Random Forest for classification. Joblib serialization.

**Feature Engineering** - Pandas-based pipelines. NumPy for numerical operations. Scikit-learn preprocessing.

**CLI** - Argparse for parsing. JSON/CSV output formats. Subprocess-safe for AI agents.

**TUI** - Textual framework. Modern terminal interface. Keyboard navigation.

**Data Pipeline** - Requests for HTTP. BeautifulSoup for parsing. Pandas for transformation.

**Caching** - SQLite for analytics. File-based for queries. Automatic cleanup.

----

License
=======

Open source. Check the repository for details.

----

.. toctree::
   :hidden:
   :maxdepth: 2

Indices
=======

* :ref:`genindex`
* :ref:`search`