AI Free Code


Cricket is one of the most popular outdoor sports in the world. Among the many series that are held, the Indian Premier League (IPL) has a long and illustrious tradition. The IPL is a professional Twenty20 (T20) league founded by the Board of Control for Cricket in India (BCCI), and it started in 2008. In the T20 format, each side bats for a maximum of 20 overs. Every year, eight teams from eight Indian cities participate in the league.

A cricket match is influenced by a variety of factors, and this project describes those that have a major impact on the outcome of a T20 match. The IPL Score Prediction project takes several years of IPL data, including player, match location, team, and ball-by-ball information, and analyzes it to draw conclusions that can help improve players' results. It focuses on calculating the results of IPL matches using data mining techniques on both balanced and imbalanced datasets.

In T20 matches, the first-innings score is currently estimated from the current run rate, which is the number of runs scored divided by the number of overs bowled. This project also considers the following factors:

  • 1. Number of wickets left
  • 2. Number of balls left
  • 3. How many runs have the current batsmen scored?
  • 4. How many runs has the team scored in the last 5 overs?
  • 5. How many wickets has the team lost in the last 5 overs?
  • 6. The nature of the pitch
  • 7. How strong are the batting and bowling teams?
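
As a point of reference, the run-rate baseline mentioned above can be sketched in a few lines of Python. The function name and the sample figures are illustrative assumptions, not part of the project's code. Overs are expected as a plain decimal number of overs, not in the cricket "overs.balls" notation.

```python
def project_score(runs: int, overs_bowled: float, total_overs: int = 20) -> float:
    """Project a first-innings total from the current run rate alone."""
    if overs_bowled <= 0:
        raise ValueError("overs_bowled must be positive")
    run_rate = runs / overs_bowled   # runs scored per over so far
    return run_rate * total_overs    # assume the same rate holds for the whole innings


# Hypothetical example: 80 runs after 10 overs projects to 160.
print(project_score(80, 10))  # 160.0
```

This baseline ignores wickets in hand, recent scoring, and the other factors listed above, which is the gap a trained model tries to close.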


  • 1. First, the model is trained on the training portion of the data.
  • 2. We set aside 15-20% of the data from the data collection to test the model.
  • 3. For the prediction, we will be using a Linear regression algorithm.
  • 4. The project is split into three separate Jupyter Notebooks: one to collect the IPL data, inspect it, and clean it; a second to further refine the features; and a third to fit the data to a Linear Regression model and evaluate its output.


  • 1. Python
  • 2. Pandas
  • 3. NumPy
  • 4. Matplotlib
  • 5. Jupyter Notebook
  • 6. PyCharm


  • 1. The system must provide the predicted IPL score.
  • 2. The system must have an easy-to-use interface for all users.
  • 3. The admin must be able to modify/update the dataset.
  • 4. The dataset of the IPL score must be available for the system.
Figure 01: Block Diagram


Kaggle was used to collect the IPL score data. We used 80% of the data as the training set and the remaining 20% as the test set. The parameters are:

  • 1. Venue
  • 2. Bat Team
  • 3. Bowl Team
  • 4. Batsman
  • 5. Bowler
  • 6. Runs
  • 7. Wickets
  • 8. Overs
  • 9. Runs last 5
  • 10. Striker
  • 11. Non Striker
  • 12. Total
Figure 02: IPL Score Dataset
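
The 80/20 split described above could be done with pandas roughly as follows. This is a minimal sketch: the column names and values are invented stand-ins, and the real Kaggle file may be laid out differently.

```python
import pandas as pd

# Tiny stand-in for the Kaggle data; column names and values are assumptions.
df = pd.DataFrame({
    "runs":        [12, 25, 40, 58, 71, 90, 104, 120, 133, 150],
    "wickets":     [0, 0, 1, 1, 2, 2, 3, 3, 4, 5],
    "overs":       [2, 4, 6, 8, 10, 12, 14, 16, 18, 20],
    "runs_last_5": [12, 25, 38, 40, 42, 45, 48, 50, 52, 55],
    "total":       [150] * 10,
})

train = df.sample(frac=0.8, random_state=42)  # 80% of the rows, chosen at random
test = df.drop(train.index)                   # the remaining 20%

print(len(train), len(test))  # 8 2
```

Because the data is recorded ball by ball, a purely random split can place rows from the same match in both sets. Splitting by match or by season is an alternative worth considering.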


  • 1. Feature Selection: We have a lot of unnecessary attributes in our data that we won’t use in our project. As a result, we only use the attributes that we need.

  • 2. Normalization: After the data collected from the internet has been filtered, it is normalized. Normalization means rescaling real-valued numeric attributes into the range between 0 and 1.

  • 3. Machine Learning: Training a model is the process of iteratively refining the prediction equation by looping over the dataset several times, each time updating the weight and bias values in the downhill direction indicated by the gradient of the cost function. We consider training complete when the error falls to an acceptable level, or when further training iterations (epochs or cycles) fail to reduce the cost.
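
The normalization and training steps above can be sketched with NumPy as shown below. This is an illustration under stated assumptions, not the project's notebook code: the function names, learning rate, stopping tolerance, and toy data are all made up for the example.

```python
import numpy as np


def min_max_normalize(X):
    """Rescale each column of X into the range [0, 1]."""
    X = np.asarray(X, dtype=float)
    col_min = X.min(axis=0)
    col_range = X.max(axis=0) - col_min
    col_range[col_range == 0] = 1.0  # avoid dividing by zero for constant columns
    return (X - col_min) / col_range


def train_linear_regression(X, y, lr=0.1, epochs=5000, tol=1e-9):
    """Fit y ~ X @ w + b by batch gradient descent on half the mean squared error."""
    m, n = X.shape
    w = np.zeros(n)
    b = 0.0
    prev_cost = np.inf
    for _ in range(epochs):
        error = X @ w + b - y
        cost = (error ** 2).mean() / 2
        if prev_cost - cost < tol:   # stop when the cost no longer improves
            break
        prev_cost = cost
        w -= lr * (X.T @ error) / m  # step against the gradient w.r.t. the weights
        b -= lr * error.mean()       # step against the gradient w.r.t. the bias
    return w, b


# Toy data (made up): columns are runs so far and wickets lost.
X_raw = np.array([[30, 0], [60, 1], [90, 2], [120, 4], [150, 5]])
y = np.array([170.0, 165.0, 160.0, 150.0, 155.0])

X = min_max_normalize(X_raw)
w, b = train_linear_regression(X, y)
mse = np.mean((X @ w + b - y) ** 2)
print("weights:", w, "bias:", b, "mse:", mse)
```

Normalizing first keeps all features on a similar scale, which lets a single learning rate work reasonably well for every weight.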


Linear Regression is the algorithm used in our project. 

  • 1. Linear Regression:

    Regression measures the average relationship between two or more continuous variables, expressed in terms of a response variable and one or more feature variables. In other words, regression analysis determines the nature of the relationship between the variables so that the most likely value of the dependent variable can be predicted for given values of the independent variables.

Figure 03: Linear Regression
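
In symbols, a linear regression model with n feature variables is commonly written as below, together with the mean-squared-error cost that training minimizes. The notation (weights w, bias b, and m training examples) is our choice for illustration and does not come from the project itself.

```latex
\hat{y} = w_1 x_1 + w_2 x_2 + \dots + w_n x_n + b

J(w, b) = \frac{1}{2m} \sum_{i=1}^{m} \left( \hat{y}^{(i)} - y^{(i)} \right)^2
```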


The IPL score prediction system works as intended. All attribute values were preprocessed, after which the model was trained on the training data. The accuracy of the Linear Regression model was found to be 82%. The GUI for IPL score prediction was built with HyperText Markup Language (HTML). The coding was done in Jupyter Notebook and VS Code. Finally, we linked the front end (HTML) with the back end (Python).
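
The text does not say how the 82% figure was computed. For a regression model, one common measure is the coefficient of determination (R²). The sketch below shows that calculation as an assumption about what such a metric might look like, not as the project's actual evaluation code; the sample numbers are made up.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # unexplained variation
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total variation
    return 1 - ss_res / ss_tot


# Hypothetical actual and predicted first-innings totals.
actual = [150, 160, 170, 180]
predicted = [155, 158, 172, 175]
print(round(r_squared(actual, predicted), 3))  # 0.884
```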


The Linear Regression algorithm is applied to the IPL dataset, which is essential for improving players' future performance. Using selected input variables from the Kaggle dataset, we have created a model to forecast the IPL score. The issue with the current IPL dataset is that we are unable to organize ourselves and complete critical tasks. So, this model was created to predict the IPL score with high precision while taking into account all of the factors that influence it.

☺ Thanks for your time ☺

What do you think of this “IPL First Innings Score Prediction using ML Algorithm”? Let us know by leaving a comment below. (Appreciation, suggestions, and questions are all welcome.)

