Winning at Sports Betting: Scraping and Analyzing Odds Data with Python

Daniel Guerrero
May 5, 2023

Are you looking for an edge in sports betting? Sports betting can be profitable, but it takes careful analysis and accurate predictions to get there. One way to gain an edge is to analyze odds data, which can help you identify value bets and find the best available price for them. Working with odds data comes with its own challenges, such as the large volume of data and the need for accurate predictions. In this post, we’ll show you how to scrape and analyze odds data using Python and BetExplorer.com.

You can find this code and my other projects on my GitHub. You can also follow me on Twitter.

Average Odds

To begin, we’ll show you how to scrape average odds data from BetExplorer.com using Python’s requests and BeautifulSoup libraries. Average odds represent the market consensus on the likely outcome of a match and can help you identify matches where there is a discrepancy between the market and your own analysis.
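
To make the idea of a discrepancy concrete, decimal odds can be turned into implied probabilities by taking their reciprocal. The sketch below is an illustration rather than part of the original project: it assumes decimal odds for the three outcomes (1, X, 2), and the function names and sample numbers are hypothetical. Because bookmakers build in a margin, the raw reciprocals sum to more than 1, so the example rescales them proportionally, which is one simple convention among several.

```python
def implied_probabilities(odds):
    """Convert decimal odds into implied probabilities that sum to 1.

    The raw reciprocals include the bookmaker margin, so they are rescaled
    proportionally. This is a simple convention, not the only possible one.
    """
    raw = [1.0 / o for o in odds]
    total = sum(raw)  # greater than 1 when a margin is built in
    return [p / total for p in raw]


def expected_value(own_probability, decimal_odds):
    """Expected profit per unit staked, given your own probability estimate."""
    return own_probability * decimal_odds - 1.0


# Hypothetical average odds for home win (1), draw (X), away win (2).
average_odds = [2.10, 3.40, 3.60]
market = implied_probabilities(average_odds)

# Hypothetical personal estimate for the home win.
own_home_probability = 0.52
edge = expected_value(own_home_probability, average_odds[0])

print([round(p, 3) for p in market])
print(round(edge, 3))
```

With these made-up numbers the market implies roughly 45% for the home win, so a personal estimate of 52% gives a positive expected value. That is what a value bet means here, and it only holds if the estimate itself is reliable.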

First, we define the URL we will use. Then we call requests.get to retrieve the page’s HTML and parse it with BeautifulSoup. Finally, we use find to locate the table that contains the odds data and extract it into a pandas DataFrame.

import requests
import pandas as pd
from bs4 import BeautifulSoup
URL = "https://www.betexplorer.com/soccer/england/premier-league/results/"
response = requests.get(URL)
soup = BeautifulSoup(response.text, 'html.parser')

Using the browser’s Inspect Element tool, we can see that the table we want to extract has the class “table-main js-tablebanner-t js-tablebanner-ntb”, and that its primary columns are Match, Result, 1, X, 2, and Date.

Premier League page using inspect element

We will need to extract this table, create a DataFrame with those columns, and read the rows of the table to store them in the DataFrame. The easiest way to do this is to iterate over all the elements of the table and store the text of the element.

# Locate the results table by its full class string
table_matches = soup.find('table', attrs={'class': 'table-main js-tablebanner-t js-tablebanner-ntb'})
data = []
rows = table_matches.find_all('tr')
for row in rows:
    utils = []
    cols = row.find_all('td')
    # Store the visible text of every cell in the row
    for element in cols:
        utils.append(element.text)
    data.append(utils)

df = pd.DataFrame(data, columns=["Match", "Result", "1", "X", "2", "Date"])

However, this approach has two problems, both visible in the output: some rows are empty, and the code does not pick up the odds.

Output of plain code

Exploring the HTML shows that the odds are stored in an attribute called data-odd rather than in the cell text. The winning odd is displayed in a box, so its value has to be read from a nested span instead of from the cell itself. To handle both cases, we wrap only the odds lookup in a try-except block and fall back to the cell text for the other columns. Finally, to avoid empty rows, we append a row only if it is not empty. The final code and output are shown below.

data = []
for row in rows:
    utils = []
    cols = row.find_all('td')
    for element in cols:
        try:
            # Store the odds, whether or not that outcome won
            if 'data-odd' in element.attrs:
                utils.append(element['data-odd'])
            else:
                utils.append(element.span.span.span['data-odd'])
        except (AttributeError, KeyError, TypeError):
            # Not an odds cell: store the text
            utils.append(element.text)
    # Skip rows without any cells
    if utils:
        data.append(utils)
df = pd.DataFrame(data, columns=["Match", "Result", "1", "X", "2", "Date"])
df.head()
Final Output
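
Every value scraped this way is a string, so the odds need converting before any arithmetic. The following sketch is an addition, not part of the original code. It assumes rows shaped like the six columns above, with the match written as "Home - Away" and the result as "h:a". Both formats, the helper name clean_row, and the sample row are assumptions to check against your own output.

```python
def clean_row(row):
    """Turn one scraped row of strings into a dict with typed values.

    Assumes the column order Match, Result, 1, X, 2, Date, a match written
    as "Home - Away" and a result written as "h:a" (both are assumptions).
    Returns None when the row does not fit that shape.
    """
    if len(row) != 6:
        return None
    match, result, home_odd, draw_odd, away_odd, date = (v.strip() for v in row)
    try:
        home, away = (team.strip() for team in match.split(" - ", 1))
        home_goals, away_goals = (int(g) for g in result.split(":")[:2])
        odds = [float(home_odd), float(draw_odd), float(away_odd)]
    except ValueError:
        # Postponed matches, missing odds or unexpected text end up here.
        return None
    return {
        "home": home,
        "away": away,
        "home_goals": home_goals,
        "away_goals": away_goals,
        "odds_1": odds[0],
        "odds_x": odds[1],
        "odds_2": odds[2],
        # Sum of reciprocals minus 1: the margin built into the odds.
        "margin": sum(1.0 / o for o in odds) - 1.0,
        "date": date,
    }


# Hypothetical row in the same shape as the scraped data.
sample = ["Brighton - Manchester Utd", "1:0", "2.45", "3.60", "2.75", "04.05.2023"]
record = clean_row(sample)
print(record["home"], record["home_goals"], round(record["margin"], 4))
```

Applied to the scraped list, something like `[r for r in map(clean_row, data) if r]` would give a list of dicts that pd.DataFrame accepts directly, with numeric odds columns ready for analysis.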

Odds from different Bookmakers

Next, we will demonstrate how to scrape odds data from all bookmakers for a specific match. This will enable you to compare the odds from different bookmakers and identify the best value for your bets. Let’s use Brighton vs. Manchester Utd as an example. In this case, the HTML content is generated dynamically with JavaScript, so a plain request made with requests does not return the fully rendered page. We therefore use Selenium to obtain the full HTML.

from selenium import webdriver

URL = "https://www.betexplorer.com/soccer/england/premier-league/brighton-manchester-united/6DkGCi44/"

# Load the page in a real browser so the JavaScript runs
driver = webdriver.Chrome()
driver.get(URL)
html = driver.page_source
soup = BeautifulSoup(html, 'html.parser')
driver.quit()

After acquiring the HTML, the process is almost the same as before. The class of the table in this instance is “table-main h-mb15 sortable”. To keep the code short, we use a list comprehension and add a row only if it has four elements (Bookie, 1, X, 2).

table = soup.find('table', attrs={'class': "table-main h-mb15 sortable"})
rows = table.find_all('tr')

data = []
for row in rows:
    cols = row.find_all('td')
    # Keep only the non-empty cells
    bet = [element.text for element in cols if element.text != '']
    # A valid row has exactly four values: bookmaker, 1, X, 2
    if len(bet) == 4:
        data.append(bet)
bet_odds = pd.DataFrame(data, columns=['Bookmaker', '1', 'X', '2'])
bet_odds
First 5 rows of odds from bookies
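
Once the odds from several bookmakers sit in one table, comparing them is mostly a matter of taking the highest price per outcome. A minimal sketch of that comparison follows. The bookmaker names and odds are made up for illustration, and the function name best_prices is hypothetical. It works on plain rows in the same (Bookmaker, 1, X, 2) shape as the scraped data.

```python
def best_prices(rows):
    """Return, for each outcome, the highest odds and the bookmaker offering them.

    `rows` holds (bookmaker, odds_1, odds_x, odds_2) entries; odds may be
    strings, as they are straight after scraping.
    """
    best = {}
    for bookmaker, *odds in rows:
        for outcome, value in zip(("1", "X", "2"), odds):
            price = float(value)
            if outcome not in best or price > best[outcome][0]:
                best[outcome] = (price, bookmaker)
    return best


# Made-up odds for illustration only.
rows = [
    ("Bookie A", "2.40", "3.50", "2.80"),
    ("Bookie B", "2.45", "3.40", "2.75"),
    ("Bookie C", "2.38", "3.60", "2.85"),
]
best = best_prices(rows)
for outcome, (price, bookmaker) in best.items():
    print(outcome, price, bookmaker)

# Combined margin when taking the best price for every outcome.
combined_margin = sum(1.0 / price for price, _ in best.values()) - 1.0
print(round(combined_margin, 4))
```

To run it on the scraped DataFrame, passing `bet_odds.itertuples(index=False, name=None)` as `rows` should work, since that yields plain tuples in the same column order. The combined margin shows how much of the bookmakers' built-in advantage remains even when you always take the best available price.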

Now that we have scraped and filtered the odds data, we can begin analyzing it. In future posts, we will demonstrate how to utilize the data to identify value bets and create a model for predicting match outcomes.

We applied this to soccer, but you can use the same code for the sport of your choice. In our next post, we will show you how to build a machine-learning model to try to beat the bookmakers.

We will guide you through the process step-by-step, using Python libraries like Scikit-Learn. We will also offer tips and tricks for optimizing your model and achieving the highest possible accuracy.

Try out the code for yourself and see how it can help you gain an edge in sports betting.
