All soccer/football club transfers from the 1992/93 through 2020/21 seasons for 10 of the top European leagues:
- Premier League 🏴
- La Liga 🇪🇸
- Bundesliga 🇩🇪
- Serie A 🇮🇹
- Ligue 1 🇫🇷
- Primeira Liga 🇵🇹
- Eredivisie 🇳🇱
- Premier Liga* 🇷🇺
- Jupiler Pro League* 🇧🇪
- Scottish Premiership* 🏴
Data were obtained by web scraping league transfer data from Transfermarkt.
* Transfermarkt does not provide data for the 2011/12 Premier Liga season, the 1992/93 and 1993/94 Jupiler Pro League seasons, or the 1992/93–2002/03 Scottish Premiership seasons.
All data are provided in the `data` directory and grouped into season subdirectories.
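As a rough illustration of working with that layout, the sketch below walks the season subdirectories and collects every row. It assumes the files sit at `data/<season>/<league>.csv` and are UTF-8 encoded; `load_transfers` and the added `source_dir` key are hypothetical names and not part of this repository.

```python
import csv
from pathlib import Path


def load_transfers(data_dir="data"):
    """Read every CSV one level below data_dir into a list of row dicts."""
    rows = []
    # Assumed layout: <data_dir>/<season>/<league>.csv
    for csv_path in sorted(Path(data_dir).glob("*/*.csv")):
        with csv_path.open(newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                # Keep the name of the subdirectory the row came from
                row["source_dir"] = csv_path.parent.name
                rows.append(row)
    return rows
```

If you already work with pandas, reading each file with `pandas.read_csv` and combining the frames with `pandas.concat` should give an equivalent result.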
Feel free to use this dataset for your own purposes!
You can clone it or download it via DownGit.
Consult the README for more information.
If you'd like to pull the raw data directly from the source or scrape data for other countries and leagues, you can use the Python script provided by `tmtransfers`.
Clone this repository and open a terminal in the cloned folder. First ensure all dependencies are met:
pip install -r requirements.txt
The module can now be run as a script from the top directory:
python -m tmtransfers
This launches a series of text prompts. You should see the following output to start:
Select currency (default is euro):
[1] EUR €
[2] GBP £
[3] USD $
===>
Follow the prompts to input your desired league parameters.
Scraped data are then written to CSV files in a newly created `data` directory.
As an example, an output CSV for the Premier League's 2020/21 season with the default options and before cleaning should look like:
club | name | age | nationality | position | short_pos | market_value | dealing_club | dealing_country | fee | movement | window | league | season |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Arsenal FC | Thomas Partey | 27 | Ghana | Defensive Midfield | DM | €40.00m | Atlético Madrid | Spain | €50.00m | in | summer | premier-league | 2020 |
Arsenal FC | Gabriel | 22 | Brazil | Centre-Back | CB | €20.00m | LOSC Lille | France | €26.00m | in | summer | premier-league | 2020 |
Arsenal FC | Pablo Marí | 26 | Spain | Centre-Back | CB | €4.80m | Flamengo | Brazil | €5.00m | in | summer | premier-league | 2020 |
Arsenal FC | Rúnar Alex Rúnarsson | 25 | Iceland | Goalkeeper | GK | €1.20m | Dijon | France | €2.00m | in | summer | premier-league | 2020 |
Arsenal FC | Cédric Soares | 28 | Portugal | Right-Back | RB | €8.00m | Southampton | England | free transfer | in | summer | premier-league | 2020 |
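In the raw output above, the `fee` and `market_value` columns hold display strings such as `€50.00m` or `free transfer`, not numbers. The following is a minimal sketch of one way to convert them. It is not the module's own cleaning logic; `parse_fee` is a hypothetical helper, and the `k` and `Th.` suffixes are assumptions, since only `m` appears in the sample rows.

```python
import re

# Only "m" appears in the sample rows; "k" and "Th." are assumed suffixes.
_MULTIPLIERS = {"m": 1_000_000, "k": 1_000, "th.": 1_000}
_FEE_PATTERN = re.compile(r"^[€£$]\s*([\d.,]+)\s*(m|k|th\.)?$", re.IGNORECASE)


def parse_fee(text):
    """Return the fee in whole currency units, or None if it is not numeric."""
    cleaned = text.strip()
    if cleaned.lower() == "free transfer":
        return 0.0
    match = _FEE_PATTERN.match(cleaned)
    if match is None:
        return None  # anything else (placeholders, free-text notes) stays unparsed
    number = float(match.group(1).replace(",", ""))
    suffix = (match.group(2) or "").lower()
    return number * _MULTIPLIERS.get(suffix, 1)
```

Returning `None` for unrecognized strings keeps unknown values distinguishable from genuine free transfers, which are mapped to `0.0`.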
Note: If you run the script again and scrape data for the same league and same season, the existing CSV will be overwritten. Be sure to move or rename existing files if you need them as is before running the script again.
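One way to guard against accidental overwrites is to copy an existing CSV to a timestamped name before scraping again. The sketch below uses only the standard library; `backup_csv` is a hypothetical helper, and the path you pass to it depends on where the script wrote your file.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_csv(csv_path):
    """Copy csv_path to a timestamped sibling file; return the new path or None."""
    source = Path(csv_path)
    if not source.is_file():
        return None  # nothing to protect yet
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    # e.g. premier-league.csv -> premier-league.<timestamp>.csv
    target = source.with_name(f"{source.stem}.{stamp}{source.suffix}")
    shutil.copy2(source, target)
    return target
```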
If you'd like to use this module elsewhere, install it from the top directory with
pip install .
It provides two functions, `scrape_transfermarkt` and `tidy_transfers`.
Use them like so:
import pandas
import tmtransfers

# Web scrape data for a league not explicitly given in the script
# Returns a Pandas dataframe
df = tmtransfers.scrape_transfermarkt(
    league_name='championship',
    league_id='GB2',
    season_id='2005',
    write=True)

# Clean the data
# Returns another Pandas dataframe
tidy_df = tmtransfers.tidy_transfers(df)
See the documentation in `tmtransfers.py` for more details.
Note: These functions have been tested only on the leagues and seasons listed above. To scrape other countries and leagues, you'll have to browse Transfermarkt to find the league name, league ID, and season ID to pass in.
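The values to pass can often be read off the address of a league's transfers page. The sketch below assumes the URL looks like `https://www.transfermarkt.com/<league_name>/transfers/wettbewerb/<league_id>/plus/?saison_id=<season_id>`; that shape is an assumption about the site and may change, and `league_params_from_url` is a hypothetical helper, not part of `tmtransfers`.

```python
import re
from urllib.parse import parse_qs, urlparse

# Assumed path shape: /<league_name>/.../wettbewerb/<league_id>/...
_PATH_PATTERN = re.compile(
    r"^/(?P<name>[^/]+)/(?:[^/]+/)*?wettbewerb/(?P<id>[A-Za-z0-9]+)"
)


def league_params_from_url(url):
    """Extract league_name, league_id and (if present) season_id from a URL."""
    parsed = urlparse(url)
    match = _PATH_PATTERN.match(parsed.path)
    if match is None:
        raise ValueError(f"unrecognized league URL: {url!r}")
    params = {"league_name": match["name"], "league_id": match["id"]}
    # The season is assumed to be carried in the saison_id query parameter
    season = parse_qs(parsed.query).get("saison_id")
    if season:
        params["season_id"] = season[0]
    return params
```

For the Championship example above, such a URL would yield `championship`, `GB2`, and `2005`, which can then be passed to `scrape_transfermarkt` as keyword arguments.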
All data are scraped from Transfermarkt according to their terms of use.