How to train a model for predicting corners and cards?

Contents

What is a model for predicting corners and cards in football
What data is needed for a model predicting corners and cards in football
API for sports events to obtain statistics on corners and cards
How to collect and prepare data from the API for training a model to predict corners and cards
How to train a machine learning model to predict corners and cards in football
How to evaluate the accuracy of a model predicting corners and cards based on football statistics
How to use a model predicting corners and cards in sports betting and analytics

What is a model for predicting corners and cards in football

A model for predicting corners and cards in football is a formalized algorithm that assesses, based on historical and current match statistics, how many corner kicks and disciplinary sanctions (yellow and red cards) will occur in the game. Such models are most often built in two formats: regression (predicting the exact number of events) and probabilistic (assessing the chances that the total of corners or cards will exceed a given value). A quality model takes into account the teams’ styles, the strength of opponents, the tournament context, and even the dynamics of the match.
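The link between the two formats can be illustrated with a small sketch: a regression output (an expected total) is turned into the probability of exceeding a line. The Poisson distribution used here is an assumption made for illustration, not something the article prescribes; real corner counts are often more dispersed than Poisson, and the function name and the numbers are hypothetical.

```python
import math


def poisson_over_probability(expected_total: float, line: float) -> float:
    """P(total > line) for a Poisson-distributed count with the given mean."""
    k_max = math.floor(line)  # largest count that still lands under the line
    cdf = sum(
        math.exp(-expected_total) * expected_total ** k / math.factorial(k)
        for k in range(k_max + 1)
    )
    return 1.0 - cdf


# Hypothetical regression output: 10.2 expected corners in the match.
p_over = poisson_over_probability(expected_total=10.2, line=9.5)
print(f"P(total corners > 9.5) = {p_over:.3f}")
```

The same function can be applied to an expected card total and a 4.5 line, provided the distributional assumption is checked against historical data first.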

In applied work, corner and card predictions serve several purposes. Club analysts and media platforms use them for advanced statistics and visualizations. Bookmakers and professional bettors compare the model's predictions with bookmaker odds to find mispriced markets. Data providers and sports startups integrate such calculations directly into their products, relying on a stable stream of match data, for example through the sports events API available at api-sport.ru.

It is important to understand that the model does not «guess» the future randomly. It relies on large volumes of structured data: detailed match statistics, live events, information about tournaments and teams. The richer and cleaner the source data, the more stable and accurate the predictions. Therefore, the key technical foundation of any model predicting corners and cards is a reliable source of statistics with clear documentation and a predictable response format.

What data is needed for a model predicting corners and cards in football

To build a working model for predicting corners and cards, a historical array of matches with detailed post-match statistics is required. The minimum level includes the total number of corners for both home and away teams, the number of yellow and red cards, the number of fouls and offsides. However, to increase accuracy, it is worth adding more detailed indicators: ball possession, shots on goal, the number of dangerous attacks, crosses, and actions in the final third. In many APIs, these indicators are available through the extended match statistics block, where each metric has its own key.

In addition to the events themselves, the context is critically important for the models. It is desirable to have information about the tournament and its level, the current round, the home or away status of the team, the date and time of the match, the city, and the stadium where the match is held. It is useful to add team form (results of recent matches), average values of corners and cards for the season, foul frequency, and the playing style of opponents. Such data allows the algorithm to understand that, for example, a derby in the upper part of the table and a match of outsiders at the end of the season generate different profiles in terms of discipline and corners.
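Team form features of this kind can be sketched as rolling averages that use only matches played before the one being described. The field names (`home_team`, `home_corners`, and so on) and the window size are illustrative assumptions, not the API's schema.

```python
from collections import defaultdict, deque


def add_prematch_corner_averages(matches, window=5):
    """Attach each team's average corners over its previous `window` matches.

    `matches` must be sorted by kickoff date, oldest first. Averages are read
    BEFORE the current match is added to the history, so nothing from the
    match itself leaks into its own features.
    """
    history = defaultdict(lambda: deque(maxlen=window))
    rows = []
    for m in matches:
        row = dict(m)
        for side in ("home", "away"):
            past = history[m[f"{side}_team"]]
            row[f"{side}_avg_corners"] = sum(past) / len(past) if past else None
        rows.append(row)
        # Update the history only after the features were taken.
        history[m["home_team"]].append(m["home_corners"])
        history[m["away_team"]].append(m["away_corners"])
    return rows


# Hypothetical sample data.
sample = [
    {"date": "2025-08-01", "home_team": "A", "away_team": "B", "home_corners": 6, "away_corners": 4},
    {"date": "2025-08-08", "home_team": "B", "away_team": "A", "home_corners": 2, "away_corners": 8},
    {"date": "2025-08-15", "home_team": "A", "away_team": "B", "home_corners": 5, "away_corners": 3},
]
form_rows = add_prematch_corner_averages(sample)
for r in form_rows:
    print(r["date"], r["home_avg_corners"], r["away_avg_corners"])
```

The first match of each team has no history, so its averages are left as None rather than filled with a guess.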

Special attention should be paid to live data if you plan to predict corners and cards in real-time. In this case, the current minute of the match, the score, information about already shown cards and the number of corners, as well as the flow of events (fouls, dangerous attacks, red cards) will be needed. In sports event APIs, this data is usually presented as an array of live events with event types and a separate block of statistics by match periods (ALL, 1ST, 2ND). It is the combination of pre-match and live statistics that makes the model truly practically useful.

API for sports events to obtain statistics on corners and cards

To train a model for predicting corners and cards, the first step is to set up stable data collection. A sports events API helps here: it provides structured information about football matches via HTTP requests. In the documentation, the basic football endpoint is /v2/football/matches; it returns a list of matches with the field matchStatistics, which contains the key metrics: ball possession, shots, fouls, corners (key cornerKicks), yellow cards (key yellowCards), and other indicators.

To obtain complete statistics for a specific match, use the endpoint /v2/football/matches/{matchId}. The response contains a match object, including an array matchStatistics for different periods (ALL, 1ST, 2ND) and an array liveEvents. From liveEvents you can extract the chronology of cards, since each card arrives as an event with the type card, a team (home/away), and a time in minutes. If you need more detail, the path /v2/football/matches/{matchId}/events returns the full event log separately from the main match information.

Below is an example of a simple request in Python that calls the API, retrieves matches for a specific date, and extracts basic statistics on corners and yellow cards. An API key is used for authorization; it can be obtained in the personal account after registering on the platform.

import requests

API_KEY = "YOUR_API_KEY"
BASE_URL = "https://api.api-sport.ru/v2/football/matches"

params = {
    "date": "2025-09-03"  # target date in YYYY-MM-DD format
}
headers = {
    "Authorization": API_KEY
}

response = requests.get(BASE_URL, params=params, headers=headers, timeout=30)
response.raise_for_status()
data = response.json()

for match in data.get("matches", []):
    # Pick the full-match statistics block (periods: ALL, 1ST, 2ND)
    stats_all = next(
        (s for s in match.get("matchStatistics", []) if s.get("period") == "ALL"),
        None,
    )
    if not stats_all:
        continue  # no statistics available for this match
    corners = None
    yellow_cards = None
    for group in stats_all.get("groups", []):
        for item in group.get("statisticsItems", []):
            if item.get("key") == "cornerKicks":
                corners = (item.get("homeValue"), item.get("awayValue"))
            elif item.get("key") == "yellowCards":
                yellow_cards = (item.get("homeValue"), item.get("awayValue"))
    print(match["id"], "corners:", corners, "yellow cards:", yellow_cards)

How to collect and prepare data from the API for training a model to predict corners and cards

After you have mastered the basic endpoints, the next step is bulk collection of historical data. This is usually done by season or tournament: using category and tournament endpoints, you get a list of competitions and seasons, and then for each season, you collect all matches through /v2/football/matches with filters tournament_id, season_id or by date range. The goal is to create a table where each row corresponds to a match, and the columns contain features and target variables, such as the total number of corners and cards by teams.
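Collection over a date range can be sketched as a day-by-day loop with de-duplication by match ID. To keep the sketch runnable offline, the HTTP call is replaced by an injectable function; `fetch_day`, `fake_fetch`, and the sample IDs are hypothetical, and a real implementation would wrap a request to /v2/football/matches with the date parameter.

```python
from datetime import date, timedelta


def iter_dates(start: date, end: date):
    """Yield every calendar day from start to end, inclusive."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)


def collect_matches(fetch_day, start: date, end: date):
    """Collect matches day by day, de-duplicating by match id.

    `fetch_day` is any callable that takes a YYYY-MM-DD string and returns a
    list of match dicts; in a real project it would wrap the HTTP request.
    """
    seen = {}
    for day in iter_dates(start, end):
        for match in fetch_day(day.isoformat()):
            seen[match["id"]] = match
    return list(seen.values())


# Stand-in for the API call so the sketch runs offline (hypothetical data).
def fake_fetch(day: str):
    data = {
        "2025-09-01": [{"id": 1}, {"id": 2}],
        "2025-09-02": [{"id": 2}, {"id": 3}],
    }
    return data.get(day, [])


matches = collect_matches(fake_fetch, date(2025, 9, 1), date(2025, 9, 3))
print([m["id"] for m in matches])
```

Passing the fetch function as an argument also makes it easy to add retries or rate limiting later without touching the collection loop.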

At the preparation stage, it is important to bring the data to a unified format. Statistical indicators from matchStatistics need to be carefully unpacked: for each key of interest (for example, cornerKicks, yellowCards, fouls, totalShotsOnGoal, ballPossession), create a separate numerical field. If you use live events from /events or liveEvents, additional features can be derived from the card events: the minute of the first yellow card, whether there were any red cards, the number of cards before the 60th minute, and so on. The final dataset is conveniently stored in a database or in CSV/Parquet format so that it can be loaded quickly into machine learning tools later.
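The card features mentioned above can be derived roughly as follows. The event shape is an assumption: the fields `type`, `team`, and `time` follow the description of card events given earlier, while the `color` field and the function name are hypothetical and should be checked against the actual API response.

```python
def card_features(events, cutoff_minute=60):
    """Derive simple card features from a list of match events.

    Assumed event shape (not confirmed by the documentation quoted here):
    {"type": "card", "team": "home" | "away", "time": <minute>, "color": "yellow" | "red"}
    """
    cards = sorted(
        (e for e in events if e.get("type") == "card" and e.get("time") is not None),
        key=lambda e: e["time"],
    )
    yellow_minutes = [e["time"] for e in cards if e.get("color") == "yellow"]
    return {
        "first_yellow_minute": yellow_minutes[0] if yellow_minutes else None,
        "has_red_card": any(e.get("color") == "red" for e in cards),
        "cards_before_cutoff": sum(1 for e in cards if e["time"] < cutoff_minute),
        "home_cards": sum(1 for e in cards if e.get("team") == "home"),
        "away_cards": sum(1 for e in cards if e.get("team") == "away"),
    }


# Hypothetical event log, deliberately out of chronological order.
sample_events = [
    {"type": "goal", "team": "home", "time": 12},
    {"type": "card", "team": "away", "time": 71, "color": "red"},
    {"type": "card", "team": "home", "time": 23, "color": "yellow"},
    {"type": "card", "team": "away", "time": 58, "color": "yellow"},
]
card_feats = card_features(sample_events)
print(card_feats)
```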

Below is an example of a Python script that iterates over a list of match IDs, creates a template for the future training set, and saves it in CSV. In a real project, you would supplement it with loops over seasons, logging, and error handling, but the basic structure will remain similar.

import csv
import requests

API_KEY = "YOUR_API_KEY"
BASE_URL = "https://api.api-sport.ru/v2/football/matches/{}"
HEADERS = {"Authorization": API_KEY}
match_ids = [14570728, 14586240]  # list of match IDs collected in advance

# Statistic keys to extract from the full-match (ALL) block
STAT_KEYS = ["cornerKicks", "yellowCards", "fouls", "totalShotsOnGoal"]

with open("matches_corners_cards.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow([
        "match_id", "tournament_id", "season_id",
        "home_corners", "away_corners",
        "home_yellow", "away_yellow",
        "home_fouls", "away_fouls",
        "home_shots", "away_shots",
    ])
    for mid in match_ids:
        resp = requests.get(BASE_URL.format(mid), headers=HEADERS, timeout=30)
        resp.raise_for_status()
        match = resp.json()
        stats_all = next(
            (s for s in match.get("matchStatistics", []) if s.get("period") == "ALL"),
            None,
        )
        if not stats_all:
            continue  # skip matches without statistics
        # Missing metrics stay empty (None) so they are not mistaken for real zeros
        metrics = {key: (None, None) for key in STAT_KEYS}
        for group in stats_all.get("groups", []):
            for item in group.get("statisticsItems", []):
                key = item.get("key")
                if key in metrics:
                    metrics[key] = (item.get("homeValue"), item.get("awayValue"))
        writer.writerow([
            match["id"],
            match["tournament"]["id"],
            match["season"]["id"],
            *metrics["cornerKicks"],
            *metrics["yellowCards"],
            *metrics["fouls"],
            *metrics["totalShotsOnGoal"],
        ])

How to train a machine learning model to predict corners and cards in football

After preparing the dataset, you can proceed to model building. In practice, it is convenient to divide the task into several subtasks. For corners, regression models are often used that predict the number of corners for each team or the overall total. For cards, both regression (number of yellow and red cards) and classification work well: the probability that there will be more than a certain threshold of cards in the match (for example, total more than 4.5). Typical algorithms include linear and logistic regression, random forest, gradient boosting (XGBoost, LightGBM), as well as neural networks if the data volume is large.

Before training, the dataset is split into training and testing samples considering time: it is important that the model learns from older matches and is tested on more recent ones, otherwise there will be a «leak» of future information. Numerical features are normalized or standardized if necessary, and categorical ones (tournament, country, tactical scheme) are encoded using one-hot or target encoding. As target variables, you can use both exact values (for example, the total number of corners for home and away teams) and binary labels (total more/less than 9.5 corners, presence of a red card, etc.).
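The time-aware split and the binary targets described above can be sketched without any ML library. The row structure, the `date` and `total_corners` field names, and the 20% test share are illustrative assumptions.

```python
def chronological_split(rows, test_share=0.2):
    """Sort rows by date and hold out the most recent share as the test set."""
    ordered = sorted(rows, key=lambda r: r["date"])  # ISO dates sort correctly as strings
    n_test = max(1, round(len(ordered) * test_share))
    cut = len(ordered) - n_test
    return ordered[:cut], ordered[cut:]


def over_label(total, line):
    """Binary target: 1 if the total is above the line, else 0."""
    return int(total > line)


# Hypothetical rows, deliberately unsorted.
rows = [
    {"date": "2025-03-10", "total_corners": 12},
    {"date": "2025-01-05", "total_corners": 8},
    {"date": "2025-04-02", "total_corners": 10},
    {"date": "2025-02-14", "total_corners": 9},
    {"date": "2025-05-20", "total_corners": 7},
]
train, test = chronological_split(rows)
for r in train + test:
    r["over_9_5"] = over_label(r["total_corners"], 9.5)

print("train:", [(r["date"], r["over_9_5"]) for r in train])
print("test: ", [(r["date"], r["over_9_5"]) for r in test])
```

Because every training date precedes every test date, the evaluation mimics how the model would be used on future matches.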

Below is a simplified example of training a regression model to predict the total number of corners in a match using the scikit-learn library. It assumes that you have already created a table with features and a target column total_corners based on the data downloaded from api-sport.ru.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error

# Load the prepared dataset; rows are assumed to be sorted by kickoff date, oldest first
data = pd.read_csv("matches_corners_cards_features.csv")

# All features must be known before kickoff; home_shots / away_shots are assumed
# here to be pre-match averages, not statistics of the match being predicted
FEATURES = [
    "home_avg_corners", "away_avg_corners",
    "home_avg_fouls", "away_avg_fouls",
    "home_avg_yellow", "away_avg_yellow",
    "home_shots", "away_shots",
]
TARGET = "total_corners"

X = data[FEATURES]
y = data[TARGET]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=False  # time-based split: train on older matches
)

model = RandomForestRegressor(
    n_estimators=300,
    max_depth=8,
    random_state=42,
    n_jobs=-1,
)
model.fit(X_train, y_train)

preds = model.predict(X_test)
mae = mean_absolute_error(y_test, preds)
print("MAE for total corners:", mae)

How to evaluate the accuracy of a model predicting corners and cards based on football statistics

Model evaluation is a critical stage that shows whether the forecasts can be trusted in real conditions. For numerical targets (number of corners, number of cards), MAE (mean absolute error) and RMSE (root mean squared error) are popular metrics. They express the typical prediction error in units of events: how far off the model usually is in the number of corners or cards. For probabilistic models that output the chances of exceeding a certain total or of a red card being shown, ROC-AUC, log loss, and Brier score are usually used. These metrics help understand how well the model ranks outcomes and how well calibrated its probabilities are.

It is important to evaluate the model not only by global metrics but also by stability over time and segments. For example, one can check the quality separately for top leagues and lower divisions, for matches of favorites and outsiders, by seasons. Another mandatory step is a pseudo-backtest: reproducing the chronological sequence of matches, where at each step you use only the information that was available at the time of the match. This is especially important if you plan to use live data and WebSocket subscriptions for updating probabilities in real time in the future.
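A pseudo-backtest of this kind can be sketched as a walk-forward loop. The predictor below is a deliberately naive running-mean baseline chosen for illustration, not the article's model; in practice the trained model would be refit or updated at each step. All names and numbers are hypothetical.

```python
def walk_forward_mae(matches, min_history=3):
    """Chronological backtest of a naive baseline.

    Each match is predicted with the mean total of all EARLIER matches only;
    the actual value is added to the history after the prediction is scored.
    `matches` must be sorted oldest first.
    """
    history, errors = [], []
    for m in matches:
        if len(history) >= min_history:
            prediction = sum(history) / len(history)
            errors.append(abs(m["total_corners"] - prediction))
        history.append(m["total_corners"])
    return sum(errors) / len(errors) if errors else None


# Hypothetical season, already in chronological order.
season = [{"total_corners": t} for t in (10, 8, 12, 9, 11, 7)]
backtest_mae = walk_forward_mae(season)
print("Walk-forward MAE of the running-mean baseline:", backtest_mae)
```

A baseline like this is also a useful reference point: a trained model that cannot beat it in a walk-forward test is not adding value.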

Below is an example of calculating basic metrics for a model predicting the total number of corners and a binary model assessing the probability that more than 4.5 cards will be shown in a match. Such analysis will help understand how suitable the model is for practical use in analytical panels or integration with the bookmaker line through the field oddsBase in API responses.

from sklearn.metrics import mean_absolute_error, mean_squared_error, roc_auc_score, brier_score_loss
import numpy as np

# y_true_corners, y_pred_corners: actual and predicted total corners
# y_true_cards_binary, y_pred_cards_proba: actual outcome (0/1) and predicted probability of total cards > 4.5
mae = mean_absolute_error(y_true_corners, y_pred_corners)
# RMSE via np.sqrt: the squared=False argument was removed in recent scikit-learn releases
rmse = np.sqrt(mean_squared_error(y_true_corners, y_pred_corners))
auc = roc_auc_score(y_true_cards_binary, y_pred_cards_proba)
brier = brier_score_loss(y_true_cards_binary, y_pred_cards_proba)

print(f"Corners MAE: {mae:.3f}")
print(f"Corners RMSE: {rmse:.3f}")
print(f"ROC-AUC for total cards > 4.5: {auc:.3f}")
print(f"Brier score for cards: {brier:.3f}")

How to use a model predicting corners and cards in sports betting and analytics

The practical application of the corner and card prediction model directly depends on the quality and relevance of the data. In sports analytics, such models are embedded in dashboards and reports: they show the expected number of corners and cards in upcoming matches, build probability distributions for totals, and visualize team style comparisons. For media and fan platforms, this is an opportunity to offer the audience deeper statistics than just the score and expected goals. Within clubs and academies, such calculations help analyze team discipline, trends in fouls, and risks of expulsions in specific game scenarios.

In betting and risk management, the model serves as the basis for assessing «fair» odds. By comparing the predicted probabilities with the bookmaker line obtained through the field oddsBase in API responses, you can find discrepancies and identify value bets in the corners and cards markets. Bookmakers and traders use such models for automated odds setting and dynamic line adjustment based on live data. Live endpoints are particularly useful here, as are the move of the infrastructure towards WebSocket and the AI modules that the api-sport.ru platform plans to expand.
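The comparison with the bookmaker line can be sketched as follows. The structure of oddsBase is not described here, so the sketch takes plain decimal odds as input; the odds, the model probability, and the proportional margin removal are illustrative assumptions.

```python
def implied_probabilities(decimal_odds):
    """Convert decimal odds for mutually exclusive outcomes into probabilities
    that sum to 1 by proportionally removing the bookmaker margin."""
    raw = [1.0 / o for o in decimal_odds]
    total = sum(raw)  # exceeds 1.0 by the bookmaker margin
    return [p / total for p in raw]


def expected_value(model_probability, decimal_odds):
    """Expected profit per unit staked if the model probability is correct."""
    return model_probability * decimal_odds - 1.0


# Hypothetical line for "total corners over/under 9.5".
over_odds, under_odds = 1.95, 1.85
market_over, market_under = implied_probabilities([over_odds, under_odds])
model_over = 0.56  # hypothetical model output

print(f"Market P(over): {market_over:.3f}, model P(over): {model_over:.3f}")
print(f"EV of backing over at {over_odds}: {expected_value(model_over, over_odds):+.3f}")
```

A positive expected value only indicates value if the model's probabilities are well calibrated, which is why the calibration metrics from the evaluation step matter here.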

Technically, the integration looks like this: your application periodically requests data about upcoming and current matches via the API, feeds it into the trained model, and saves the results in its own database or a cache. On top of these predictions, you then build an interface: a feed of matches with expected corners and cards, highlighting of potentially «hot» games, signals for traders. To launch such a pipeline, it is enough to set up access and obtain a key in the personal account, after which you can connect the corner and card prediction model to any product, from an internal analytics dashboard to a public betting service.